Wikitech
labswiki
https://wikitech.wikimedia.org/wiki/Main_Page
MediaWiki 1.47.0-wmf.7
Deployments
0
4108
2428866
2428851
2026-06-21T08:06:10Z
ScheduleDeploymentBot
37566
Add [[gerrit:1304690]] to Monday, June 22 UTC afternoon backport window
2428866
wikitext
text/x-wiki
{{Navigation MediaWiki deployment}}
This page tracks '''upcoming''' '''deployments''' of software to the [[:m:Special:SiteMatrix|Wikimedia Foundation servers]].
== Getting started ==
Make sure you have joined the {{irc|wikimedia-operations}} IRC channel, as all deployment-related communication happens there.
If you need help, contact [[:mw:Wikimedia Release Engineering Team|Release Engineering]] in {{irc|wikimedia-releng}} on IRC and ping Tyler (<code>thcipriani</code>).
* '''MediaWiki is deployed weekly''' through the [[/Train|Deployment Train]]. Other services follow their own schedule.
* '''Times are pinned to San Francisco''', so the corresponding UTC times shift by one hour in March and November per [[:en:Daylight saving time in the United States|DST]].
* '''Prefer regular [[Backport windows]]''' over adding new windows. To request deployment of a config change or backport, add your username and Gerrit URL to one of the backport windows on this page. You must be online in #wikimedia-operations on IRC during your deployment and install [[WikimediaDebug]] ahead of time. The #wikimedia-operations channel requires you to [[:m:IRC/Instructions#Register your nickname, identify, and enforce|register your nickname]] before you can join.
** You can use the '''backport scheduling tool''' to more easily edit this page: <div style="text-align: center; margin: 1em 0">{{Clickable button 2|:toollabs:schedule-deployment|Schedule a backport|class=mw-ui-progressive}}</div>
* Tasks that meet the [[/Inclusion criteria|Inclusion criteria]] '''require their own windows'''; this includes long-running tasks. '''Schedule more time''' than you think you need to account for delays and setbacks. We recommend one hour for most tasks.
** To create or modify a recurring deploy window, send a patchset to the [[:gitlab:repos/releng/release/-/blob/main/make-deployment-calendar/deployments-calendar.yaml|deployments-calendar.yaml]] file in <code>repos/releng/release.git</code>.
** To create a one-off window, edit this page directly.
** '''Announce''' changes to the [[mail:ops|ops mailing list]] ahead of time if you anticipate or are uncertain about noticeable impacts to database load, HTTP caching, or the introduction of new cookies.
** '''Announce''' deployments of major features to the community via [[:m:Tech/News/Next|Tech News]] and/or via other [[:mw:Wikimedia_Product_Guidance/Communication_channels|Product communication channels]].
* '''Something went wrong?''' See [[Incident response]]. Is there a user-impacting problem? Communicate in the {{irc|wikimedia-operations}} IRC channel. If there is a Phabricator task, ensure [[:phab:tag/wikimedia-incident/|#Wikimedia-Incident]] is tagged, and consider setting the [[:mw:Phabricator/Project_management#Priority_levels|Unbreak Now]] priority.
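Because window times are pinned to San Francisco, the matching UTC time depends on whether daylight saving time is in effect. The following is a minimal sketch of that conversion, not part of any official tooling: the function name <code>sf_to_utc</code> is hypothetical, and it assumes Python 3.9 or later with a time zone database available for <code>America/Los_Angeles</code>.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumption: "SF" on this page means the America/Los_Angeles zone.
SF = ZoneInfo("America/Los_Angeles")


def sf_to_utc(when: str) -> datetime:
    """Convert a 'YYYY-MM-DD HH:MM SF' string to a timezone-aware UTC datetime."""
    stamp = when.removesuffix(" SF").strip()
    local = datetime.strptime(stamp, "%Y-%m-%d %H:%M").replace(tzinfo=SF)
    return local.astimezone(timezone.utc)


# The same 06:00 SF slot in summer (PDT, UTC-7) and in winter (PST, UTC-8).
print(sf_to_utc("2026-06-22 06:00 SF").strftime("%Y-%m-%d %H:%M UTC"))  # 2026-06-22 13:00 UTC
print(sf_to_utc("2026-12-07 06:00 SF").strftime("%Y-%m-%d %H:%M UTC"))  # 2026-12-07 14:00 UTC
```

A window listed at 23:00 SF falls on the next calendar day in UTC, so check the date as well as the hour when planning.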
__TOC__
{{anchor|Next Week|Near Term|Near term|Near-term}}{{clear}}
[[Category:Deployment]]
{{Note|content=Subscribe in Google Calendar via <code>wikimedia.org_rudis09ii2mm5fk4hgdjeh1u64@group.calendar.google.com</code>.<br>This may not include one-off windows. '''If there are differences, then the wiki page is canonical and correct'''.}}
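Each entry in the schedule is a <code>Deployment calendar event card</code> template call with <code>when</code>, <code>length</code>, <code>window</code>, <code>who</code> and <code>what</code> parameters. The sketch below shows one way to read a card and work out when its window ends. It is illustrative only: the helper names <code>parse_card</code> and <code>window_bounds</code> are hypothetical, the parser handles only the simple one-parameter-per-line layout used on this page, and it assumes <code>length</code> is given in hours.

```python
import re
from datetime import datetime, timedelta

# Sample card copied from the schedule on this page.
CARD = """{{Deployment calendar event card
|when=2026-06-22 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}"""

# A parameter line starts with '|name=' at the beginning of the line.
PARAM = re.compile(r"^\|(\w+)=(.*)$")


def parse_card(wikitext: str) -> dict[str, str]:
    """Collect '|name=value' lines; other lines continue the previous value."""
    fields: dict[str, str] = {}
    current = None
    for line in wikitext.splitlines():
        match = PARAM.match(line)
        if match:
            current = match.group(1)
            fields[current] = match.group(2)
        elif current is not None and line.strip() != "}}":
            fields[current] += "\n" + line
    return fields


def window_bounds(fields: dict[str, str]) -> tuple[datetime, datetime]:
    """Return naive (start, end) datetimes in San Francisco local time."""
    start = datetime.strptime(fields["when"].removesuffix(" SF"), "%Y-%m-%d %H:%M")
    # Assumption: 'length' is a duration in hours (0.5 means 30 minutes).
    end = start + timedelta(hours=float(fields["length"]))
    return start, end


card = parse_card(CARD)
start, end = window_bounds(card)
print(card["window"], start.strftime("%H:%M"), "to", end.strftime("%H:%M"), "SF")
```

The datetimes are deliberately left naive, so the arithmetic does not account for a daylight saving transition that falls inside a window.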
==Week of June 22==
==={{Deployment_day|date=2026-06-21}}===
{{Deployment calendar event card
|when=2026-06-21 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
==={{Deployment_day|date=2026-06-22}}===
{{Deployment calendar event card
|when=2026-06-22 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|dcausse|dcausse}}
{{deploy|type=1.47.0-wmf.7|gerrit=1304565|title=ttmserver-export: pass source language for translation batch IDs|status=}} - {{phabricator|T429479}}
{{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-22 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-22 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|manfredi|manfredi}}
{{deploy|type=config|gerrit=1304122|title=config: Enable EmailConfirmationBanner on all wikis|status=}} - {{phabricator|T428292}}
{{deploy|type=1.47.0-wmf.7|gerrit=1304125|title=Add email confirmation banner Test Kitchen instrumentation (long-term)|status=}} - {{phabricator|T428293}}
{{ircnick|tgr|Gergő}}
{{deploy|type=1.47.0-wmf.7|gerrit=1304690|title=Preserve redoLocalAuthentication flag when returning from auth domain|status=}} - {{phabricator|T429495}}
{{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-22 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-22 08:30 SF
|length=0.5
|window=Wikimedia Portals Update
|who={{ircnick|jan_drewniak|Jan Drewniak}}
|what=Weekly window for the portals page: https://www.wikipedia.org/
}}
{{Deployment calendar event card
|when=2026-06-22 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-22 10:00 SF
|length=0.5
|window=Wikidata Query Service weekly deploy
|who={{ircnick|ryankemper|Ryan}}
|what=...
}}
{{Deployment calendar event card
|when=2026-06-22 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|RoanKattouw|RoanKattouw}}
{{deploy|type=config|gerrit=1304156|title=Permissions: Create wmf-officeit group on collabwiki|status=}}
{{ircnick|bpirkle|bpirkle}}
{{deploy|type=config|gerrit=1304175|title=REST: adjust analytics and wikifunctions REST Sandbox visibility|status=}} - {{phabricator|T422770}} {{phabricator|T423058}} {{phabricator|T422771}}
{{ircnick|Sohom_Datta|Sohom}}
{{deploy|type=config|gerrit=1304630|title=Add source tab to ukwikisource's "Архів" (Archive) namespace|status=}} - {{phabricator|T53980}}
{{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-22 14:00 SF
|length=2
|window=Weekly Security deployment window
|who={{ircnick|alexsanford|Alex}}, {{ircnick|Reedy|Sam}}, {{ircnick|sbassett|Scott}}, {{ircnick|Maryum|Maryum}}, {{ircnick|manfredi|Manfredi}}
|what=Held deployment window for Security-team related deploys.
}}
{{Deployment calendar event card
|when=2026-06-22 16:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: often skipped. The reader teams do not typically check IRC, so assume the window is unused if nothing has started 5 minutes after the start time.
}}
{{Deployment calendar event card
|when=2026-06-22 19:00 SF
|length=1
|window=Automatic branching of MediaWiki, extensions, skins, and vendor – see [[Heterogeneous deployment/Train deploys]]
|who=N/A
|what=Branch <code>wmf/1.47.0-wmf.8</code>
}}
{{Deployment calendar event card
|when=2026-06-22 20:00 SF
|length=1
|window=Automatic deployment of MediaWiki, extensions, skins, and vendor to testwikis only – see [[Heterogeneous deployment/Train deploys]]
|who=N/A
|what=Deploy <code>wmf/1.47.0-wmf.8</code> to testwikis
}}
{{Deployment calendar event card
|when=2026-06-22 21:00 SF
|length=1
|window=Automatic removal of all obsolete MediaWiki versions from the deployment and bare metal servers (except the most-recent obsolete version)
|who=N/A
|what=Runs <code>scap clean auto</code>
}}
{{Deployment calendar event card
|when=2026-06-22 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-22 23:00 SF
|length=0.5
|window=Primary database switchover
|who={{ircnick|marostegui|Manuel Arostegui}}, {{ircnick|Amir1|Amir}}, {{ircnick|federico3|Federico Ceratto}}
|what=Held deployment window for database primary masters maintenance
}}
==={{Deployment_day|date=2026-06-23}}===
{{Deployment calendar event card
|when=2026-06-23 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-23 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-23 05:00 SF
|length=1
|window=Mobileapps/RESTBase/Wikifeeds
|who=Content Transform Team
|what=Content transform team node services (mobileapps/wikifeeds)
}}
{{Deployment calendar event card
|when=2026-06-23 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-23 07:00 SF
|length=0.5
|window=Test Kitchen UI Deployment Window
|who=Experimentation Platform Team
|what=Deployment of Test Kitchen UI (fka MPIC)
}}
{{Deployment calendar event card
|when=2026-06-23 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-23 08:00 SF
|length=1
|window=SRE Collaboration Services office hours
|who={{ircnick|jelto|Jelto}}, {{ircnick|arnoldokoth|Arnold}}, {{ircnick|mutante|Daniel}}, {{ircnick|arnaudb|Arnaud}}
|what=Services including Gerrit, Phorge (Phabricator), GitLab
}}
{{Deployment calendar event card
|when=2026-06-23 09:00 SF
|length=1
|window=[[Puppet request window]]<br/><small>'''(Max 6 patches)'''</small>
|who={{ircnick|jhathaway|JHathaway}}, {{ircnick|rzl|Reuven}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to Puppet change''
}}
{{Deployment calendar event card
|when=2026-06-23 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-23 11:00 SF
|length=2
|window=MediaWiki train - Utc-7 Version
|who={{ircnick|brennen|Brennen}}, {{ircnick|jeena|Jeena}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.7->1.47.0-wmf.8|1.47.0-wmf.7|1.47.0-wmf.7}}
* group0 to [[mw:MediaWiki_1.47/wmf.8|1.47.0-wmf.8]]
* '''Blockers: {{phabricator|T423917}}'''
}}
{{Deployment calendar event card
|when=2026-06-23 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-23 14:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: often skipped. The reader teams do not typically check IRC, so assume the window is unused if nothing has started 5 minutes after the start time.
}}
{{Deployment calendar event card
|when=2026-06-23 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
==={{Deployment_day|date=2026-06-24}}===
{{Deployment calendar event card
|when=2026-06-24 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-24 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-24 04:00 SF
|length=1
|window=[[mw:Services|Services]] – [[Citoid]] / [[Zotero]]
|who=Marielle ({{ircnick|mvolz}})
|what=See [[mw:Citoid|Citoid]]
}}
{{Deployment calendar event card
|when=2026-06-24 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-24 07:00 SF
|length=1
|window=Wikifunctions Services UTC Afternoon
|who=Abstract Wikipedia team (Africa, Europe, Eastern Americas)
|what=Wikifunctions back-end k8s services
}}
{{Deployment calendar event card
|when=2026-06-24 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-24 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-24 11:00 SF
|length=2
|window=MediaWiki train - Utc-7 Version
|who={{ircnick|brennen|Brennen}}, {{ircnick|jeena|Jeena}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.8|1.47.0-wmf.7->1.47.0-wmf.8|1.47.0-wmf.7}}
* group1 to [[mw:MediaWiki_1.47/wmf.8|1.47.0-wmf.8]]
* '''Blockers: {{phabricator|T423917}}'''
}}
{{Deployment calendar event card
|when=2026-06-24 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-24 14:00 SF
|length=1
|window=Wikifunctions Services UTC Late
|who=Abstract Wikipedia team (North and South America)
|what=Wikifunctions back-end k8s services
}}
{{Deployment calendar event card
|when=2026-06-24 15:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: often skipped. The reader teams do not typically check IRC, so assume the window is unused if nothing has started 5 minutes after the start time.
}}
{{Deployment calendar event card
|when=2026-06-24 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-24 23:00 SF
|length=0.5
|window=Primary database switchover
|who={{ircnick|marostegui|Manuel Arostegui}}, {{ircnick|Amir1|Amir}}, {{ircnick|federico3|Federico Ceratto}}
|what=Held deployment window for database primary masters maintenance
}}
==={{Deployment_day|date=2026-06-25}}===
{{Deployment calendar event card
|when=2026-06-25 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-25 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-25 05:00 SF
|length=1
|window=Mobileapps/RESTBase/Wikifeeds
|who=Content Transform Team
|what=Content transform team node services (mobileapps/wikifeeds)
}}
{{Deployment calendar event card
|when=2026-06-25 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-25 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-25 08:00 SF
|length=1
|window=Train log triage
|who={{ircnick|brennen|Brennen}}, {{ircnick|jeena|Jeena}}
|what=See [[Heterogeneous deployment/Train deploys#Breakage]]
}}
{{Deployment calendar event card
|when=2026-06-25 09:00 SF
|length=1
|window=[[Puppet request window]]<br/><small>'''(Max 6 patches)'''</small>
|who={{ircnick|jhathaway|JHathaway}}, {{ircnick|rzl|Reuven}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to Puppet change''
}}
{{Deployment calendar event card
|when=2026-06-25 10:00 SF
|length=1
|window=Cloud Services/Technical Documentation weekly deploy (Toolhub, Developer portal, Striker)
|who={{ircnick|bd808}}
|what=...
}}
{{Deployment calendar event card
|when=2026-06-25 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-25 11:00 SF
|length=2
|window=MediaWiki train - Utc-7 Version
|who={{ircnick|brennen|Brennen}}, {{ircnick|jeena|Jeena}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.8|1.47.0-wmf.8|1.47.0-wmf.7->1.47.0-wmf.8}}
* group2 to [[mw:MediaWiki_1.47/wmf.8|1.47.0-wmf.8]]
* '''Blockers: {{phabricator|T423917}}'''
}}
{{Deployment calendar event card
|when=2026-06-25 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-25 14:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: often skipped. The reader teams do not typically check IRC, so assume the window is unused if nothing has started 5 minutes after the start time.
}}
{{Deployment calendar event card
|when=2026-06-25 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
==={{Deployment_day|date=2026-06-26}}===
{{Deployment calendar event card
|when=2026-06-26 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
{{Deployment calendar event card
|when=2026-06-26 04:00 SF
|length=0.5
|window=GitLab version upgrades
|who={{ircnick|jelto|Jelto}}, {{ircnick|arnoldokoth|Arnold}}, {{ircnick|mutante|Daniel}}, {{ircnick|arnaudb|Arnaud}}
|what=GitLab version upgrades
}}
==={{Deployment_day|date=2026-06-27}}===
{{Deployment calendar event card
|when=2026-06-27 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
==Week of June 29==
==={{Deployment_day|date=2026-06-28}}===
{{Deployment calendar event card
|when=2026-06-28 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
==={{Deployment_day|date=2026-06-29}}===
{{Deployment calendar event card
|when=2026-06-29 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-29 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-29 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-29 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-29 08:30 SF
|length=0.5
|window=Wikimedia Portals Update
|who={{ircnick|jan_drewniak|Jan Drewniak}}
|what=Weekly window for the portals page: https://www.wikipedia.org/
}}
{{Deployment calendar event card
|when=2026-06-29 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-29 10:00 SF
|length=0.5
|window=Wikidata Query Service weekly deploy
|who={{ircnick|ryankemper|Ryan}}
|what=...
}}
{{Deployment calendar event card
|when=2026-06-29 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-29 14:00 SF
|length=2
|window=Weekly Security deployment window
|who={{ircnick|alexsanford|Alex}}, {{ircnick|Reedy|Sam}}, {{ircnick|sbassett|Scott}}, {{ircnick|Maryum|Maryum}}, {{ircnick|manfredi|Manfredi}}
|what=Held deployment window for Security-team related deploys.
}}
{{Deployment calendar event card
|when=2026-06-29 16:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: often skipped. The reader teams do not typically check IRC, so assume the window is unused if nothing has started 5 minutes after the start time.
}}
{{Deployment calendar event card
|when=2026-06-29 19:00 SF
|length=1
|window=Automatic branching of MediaWiki, extensions, skins, and vendor – see [[Heterogeneous deployment/Train deploys]]
|who=N/A
|what=Branch <code>wmf/1.47.0-wmf.9</code>
}}
{{Deployment calendar event card
|when=2026-06-29 20:00 SF
|length=1
|window=Automatic deployment of MediaWiki, extensions, skins, and vendor to testwikis only – see [[Heterogeneous deployment/Train deploys]]
|who=N/A
|what=Deploy <code>wmf/1.47.0-wmf.9</code> to testwikis
}}
{{Deployment calendar event card
|when=2026-06-29 21:00 SF
|length=1
|window=Automatic removal of all obsolete MediaWiki versions from the deployment and bare metal servers (except the most-recent obsolete version)
|who=N/A
|what=Runs <code>scap clean auto</code>
}}
{{Deployment calendar event card
|when=2026-06-29 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-29 23:00 SF
|length=0.5
|window=Primary database switchover
|who={{ircnick|marostegui|Manuel Arostegui}}, {{ircnick|Amir1|Amir}}, {{ircnick|federico3|Federico Ceratto}}
|what=Held deployment window for database primary masters maintenance
}}
==={{Deployment_day|date=2026-06-30}}===
{{Deployment calendar event card
|when=2026-06-30 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-30 01:00 SF
|length=2
|window=MediaWiki train - Utc-0+Utc-7 Version
|who={{ircnick|andre|Andre}}, {{ircnick|brennen|Brennen}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.8->1.47.0-wmf.9|1.47.0-wmf.8|1.47.0-wmf.8}}
* group0 to [[mw:MediaWiki_1.47/wmf.9|1.47.0-wmf.9]]
* '''Blockers: {{phabricator|T423918}}'''
}}
{{Deployment calendar event card
|when=2026-06-30 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-30 05:00 SF
|length=1
|window=Mobileapps/RESTBase/Wikifeeds
|who=Content Transform Team
|what=Content transform team node services (mobileapps/wikifeeds)
}}
{{Deployment calendar event card
|when=2026-06-30 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-30 07:00 SF
|length=0.5
|window=Test Kitchen UI Deployment Window
|who=Experimentation Platform Team
|what=Deployment of Test Kitchen UI (fka MPIC)
}}
{{Deployment calendar event card
|when=2026-06-30 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-30 08:00 SF
|length=1
|window=SRE Collaboration Services office hours
|who={{ircnick|jelto|Jelto}}, {{ircnick|arnoldokoth|Arnold}}, {{ircnick|mutante|Daniel}}, {{ircnick|arnaudb|Arnaud}}
|what=Services including Gerrit, Phorge (Phabricator), GitLab
}}
{{Deployment calendar event card
|when=2026-06-30 09:00 SF
|length=1
|window=[[Puppet request window]]<br/><small>'''(Max 6 patches)'''</small>
|who={{ircnick|jhathaway|JHathaway}}, {{ircnick|rzl|Reuven}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to Puppet change''
}}
{{Deployment calendar event card
|when=2026-06-30 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-30 11:00 SF
|length=2
|window=MediaWiki train - Utc-0+Utc-7 Version (secondary timeslot)
|who={{ircnick|andre|Andre}}, {{ircnick|brennen|Brennen}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.8->1.47.0-wmf.9|1.47.0-wmf.8|1.47.0-wmf.8}}
* group0 to [[mw:MediaWiki_1.47/wmf.9|1.47.0-wmf.9]]
* '''Blockers: {{phabricator|T423918}}'''
}}
{{Deployment calendar event card
|when=2026-06-30 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-30 14:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: often skipped. The reader teams do not typically check IRC, so assume the window is unused if nothing has started 5 minutes after the start time.
}}
{{Deployment calendar event card
|when=2026-06-30 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
==={{Deployment_day|date=2026-07-01}}===
{{Deployment calendar event card
|when=2026-07-01 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-07-01 01:00 SF
|length=2
|window=MediaWiki train - Utc-0+Utc-7 Version
|who={{ircnick|andre|Andre}}, {{ircnick|brennen|Brennen}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.9|1.47.0-wmf.8->1.47.0-wmf.9|1.47.0-wmf.8}}
* group1 to [[mw:MediaWiki_1.47/wmf.9|1.47.0-wmf.9]]
* '''Blockers: {{phabricator|T423918}}'''
}}
{{Deployment calendar event card
|when=2026-07-01 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-07-01 04:00 SF
|length=1
|window=[[mw:Services|Services]] – [[Citoid]] / [[Zotero]]
|who=Marielle ({{ircnick|mvolz}})
|what=See [[mw:Citoid|Citoid]]
}}
{{Deployment calendar event card
|when=2026-07-01 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-07-01 07:00 SF
|length=1
|window=Wikifunctions Services UTC Afternoon
|who=Abstract Wikipedia team (Africa, Europe, Eastern Americas)
|what=Wikifunctions back-end k8s services
}}
{{Deployment calendar event card
|when=2026-07-01 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-07-01 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-07-01 11:00 SF
|length=2
|window=MediaWiki train - Utc-0+Utc-7 Version (secondary timeslot)
|who={{ircnick|andre|Andre}}, {{ircnick|brennen|Brennen}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.9|1.47.0-wmf.8->1.47.0-wmf.9|1.47.0-wmf.8}}
* group1 to [[mw:MediaWiki_1.47/wmf.9|1.47.0-wmf.9]]
* '''Blockers: {{phabricator|T423918}}'''
}}
{{Deployment calendar event card
|when=2026-07-01 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-07-01 14:00 SF
|length=1
|window=Wikifunctions Services UTC Late
|who=Abstract Wikipedia team (North and South America)
|what=Wikifunctions back-end k8s services
}}
{{Deployment calendar event card
|when=2026-07-01 15:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: This window is often skipped. The reader teams do not typically check IRC, so assume it is not being used if nothing has started 5 minutes after the start time.
}}
{{Deployment calendar event card
|when=2026-07-01 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-07-01 23:00 SF
|length=0.5
|window=Primary database switchover
|who={{ircnick|marostegui|Manuel Arostegui}}, {{ircnick|Amir1|Amir}}, {{ircnick|federico3|Federico Ceratto}}
|what=Deployment window held for maintenance of primary (master) databases
}}
==={{Deployment_day|date=2026-07-02}}===
{{Deployment calendar event card
|when=2026-07-02 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-07-02 01:00 SF
|length=2
|window=MediaWiki train - Utc-0+Utc-7 Version
|who={{ircnick|andre|Andre}}, {{ircnick|brennen|Brennen}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.9|1.47.0-wmf.9|1.47.0-wmf.8->1.47.0-wmf.9}}
* group2 to [[mw:MediaWiki_1.47/wmf.9|1.47.0-wmf.9]]
* '''Blockers: {{phabricator|T423918}}'''
}}
{{Deployment calendar event card
|when=2026-07-02 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-07-02 05:00 SF
|length=1
|window=Mobileapps/RESTBase/Wikifeeds
|who=Content Transform Team
|what=Content transform team node services (mobileapps/wikifeeds)
}}
{{Deployment calendar event card
|when=2026-07-02 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-07-02 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-07-02 08:00 SF
|length=1
|window=Train log triage
|who={{ircnick|andre|Andre}}, {{ircnick|brennen|Brennen}}
|what=See [[Heterogeneous deployment/Train deploys#Breakage]]
}}
{{Deployment calendar event card
|when=2026-07-02 09:00 SF
|length=1
|window=[[Puppet request window]]<br/><small>'''(Max 6 patches)'''</small>
|who={{ircnick|jhathaway|JHathaway}}, {{ircnick|rzl|Reuven}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to Puppet change''
}}
{{Deployment calendar event card
|when=2026-07-02 10:00 SF
|length=1
|window=Cloud Services/Technical Documentation weekly deploy (Toolhub, Developer portal, Striker)
|who={{ircnick|bd808}}
|what=...
}}
{{Deployment calendar event card
|when=2026-07-02 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-07-02 11:00 SF
|length=2
|window=MediaWiki train - Utc-0+Utc-7 Version (secondary timeslot)
|who={{ircnick|andre|Andre}}, {{ircnick|brennen|Brennen}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.9|1.47.0-wmf.9|1.47.0-wmf.8->1.47.0-wmf.9}}
* group2 to [[mw:MediaWiki_1.47/wmf.9|1.47.0-wmf.9]]
* '''Blockers: {{phabricator|T423918}}'''
}}
{{Deployment calendar event card
|when=2026-07-02 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-07-02 14:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: This window is often skipped. The reader teams do not typically check IRC, so assume it is not being used if nothing has started 5 minutes after the start time.
}}
{{Deployment calendar event card
|when=2026-07-02 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
==={{Deployment_day|date=2026-07-03}}===
{{Deployment calendar event card
|when=2026-07-03 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
{{Deployment calendar event card
|when=2026-07-03 04:00 SF
|length=0.5
|window=GitLab version upgrades
|who={{ircnick|jelto|Jelto}}, {{ircnick|arnoldokoth|Arnold}}, {{ircnick|mutante|Daniel}}, {{ircnick|arnaudb|Arnaud}}
|what=GitLab version upgrades
}}
==={{Deployment_day|date=2026-07-04}}===
{{Deployment calendar event card
|when=2026-07-04 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
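Every <code>when</code> value on this page ends in <code>SF</code>, and the window names (UTC morning, UTC afternoon, UTC late) refer to the matching UTC time. The following is a minimal sketch of that conversion, not the code the calendar templates use. It assumes that <code>SF</code> corresponds to the IANA zone <code>America/Los_Angeles</code>; the helper name <code>when_to_utc</code> is hypothetical.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Assumption: "SF" on this page means the America/Los_Angeles time zone.
SF = ZoneInfo("America/Los_Angeles")
UTC = ZoneInfo("UTC")


def when_to_utc(when: str) -> datetime:
    """Convert a `when` value such as '2026-06-30 10:00 SF' to UTC."""
    stamp, zone = when.rsplit(" ", 1)
    if zone != "SF":
        raise ValueError(f"unsupported zone suffix: {zone!r}")
    # Attach the San Francisco zone, then let zoneinfo apply the DST rules.
    local = datetime.strptime(stamp, "%Y-%m-%d %H:%M").replace(tzinfo=SF)
    return local.astimezone(UTC)


# A 13:00 SF backport window in summer (UTC-7).
print(when_to_utc("2026-06-30 13:00 SF").isoformat())
```

Because the offset comes from the zone database rather than a fixed number, the same call gives a UTC time one hour later for dates outside daylight saving time.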
Server Admin Log
2026-06-20T13:31:03Z
Stashbot
== 2026-06-20 ==
* 13:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-19 ==
* 19:21 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303006{{!}}Disable ShortUrl on remaining wikis (T107188)]] (duration: 80m 14s)
* 19:17 krinkle@deploy1003: krinkle: Continuing with deployment
* 18:03 krinkle@deploy1003: krinkle: Backport for [[gerrit:1303006{{!}}Disable ShortUrl on remaining wikis (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:01 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1303006{{!}}Disable ShortUrl on remaining wikis (T107188)]]
* 16:22 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 16:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1023.eqiad.wmnet
* 16:08 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 16:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1023.eqiad.wmnet
* 16:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1022.eqiad.wmnet
* 15:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1022.eqiad.wmnet
* 15:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1021.eqiad.wmnet
* 15:45 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1021.eqiad.wmnet
* 15:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet
* 15:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1020.eqiad.wmnet
* 15:34 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 15:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2004.codfw.wmnet
* 15:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2004.codfw.wmnet
* 15:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2003.codfw.wmnet
* 15:17 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2003.codfw.wmnet
* 15:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2002.codfw.wmnet
* 15:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2002.codfw.wmnet
* 15:11 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet
* 14:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2009.codfw.wmnet with OS trixie
* 13:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
* 13:41 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
* 13:28 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
* 13:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2001.codfw.wmnet
* 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs1003.eqiad.wmnet
* 12:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs1003.eqiad.wmnet
* 12:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs1002.eqiad.wmnet
* 12:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs1002.eqiad.wmnet
* 12:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs1001.eqiad.wmnet
* 12:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs1001.eqiad.wmnet
* 12:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1022.eqiad.wmnet
* 12:32 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1022.eqiad.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2234.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2234.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2232.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2232.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 12:10 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 12:08 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on phab2002.codfw.wmnet with reason: Host Replacement
* 12:05 urbanecm@deploy1003: mwscript-k8s job started: GrowthExperiments:migrateMentorStatusAway.php --wiki=viwiki # [[phab:T409170|T409170]]
* 12:04 urbanecm@deploy1003: mwscript-k8s job started: GrowthExperiments:MigrateMentorStatusAway --wiki=viwiki # [[phab:T409170|T409170]]
* 11:33 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:38 moritzm: imported nodejs 24.17.0-1nodesource1 to thirdparty/node24 for trixie-wikimedia
* 10:37 moritzm: imported nodejs 22.23.0-1nodesource1 to thirdparty/node22 for trixie-wikimedia
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 10:29 sergi0: Run `MigrateMentorStatusAway` script for all wikis in growthexperiments dblist - [[phab:T409170|T409170]]
* 10:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet
* 10:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1020.eqiad.wmnet
* 10:04 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1024
* 10:03 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1024
* 10:03 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1023
* 10:03 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1023
* 10:03 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1021
* 10:03 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1021
* 10:00 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1024
* 09:59 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1024
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1022
* 09:57 btullis@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1022
* 09:56 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1020
* 09:54 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1020
* 09:43 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet
* 09:36 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1020.eqiad.wmnet
* 07:32 slyngs: Update IDP/SSO to CAS v7.3.7.3
* 07:31 slyngshede@dns1004: END - running authdns-update
* 07:30 slyngshede@dns1004: START - running authdns-update
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 49s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:19 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: sync
* 01:18 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: sync
* 01:18 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: sync
* 01:17 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics: sync
* 01:17 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: sync
* 01:17 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics: sync
* 01:06 ottomata: roll restart eventgate-analytics to pick up stream config change - [[phab:T427787|T427787]]
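The entries in this log follow a regular shape: a <code>== YYYY-MM-DD ==</code> heading per day, then bullets of the form <code>* HH:MM user@host: message</code>, where the <code>@host</code> part is absent for manually logged entries. A minimal sketch of reading that shape is shown here. The names <code>Entry</code> and <code>parse_log</code> are hypothetical, the patterns are inferred only from the lines in this excerpt, and the timestamps are kept naive because the excerpt does not state a time zone.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Patterns inferred from this excerpt only; other entry shapes may exist.
DAY_RE = re.compile(r"^== (\d{4}-\d{2}-\d{2}) ==$")
ENTRY_RE = re.compile(r"^\* (\d{2}:\d{2}) ([^\s:@]+)(?:@([^\s:]+))?: (.*)$")


class Entry(NamedTuple):
    when: datetime          # naive timestamp built from heading + HH:MM
    user: str
    host: Optional[str]     # None when the entry has no "@host" part
    message: str


def parse_log(text: str) -> list[Entry]:
    """Collect log entries, taking the date from the latest day heading."""
    entries = []
    day = None
    for line in text.splitlines():
        heading = DAY_RE.match(line)
        if heading:
            day = heading.group(1)
            continue
        entry = ENTRY_RE.match(line)
        if entry and day is not None:
            hhmm, user, host, message = entry.groups()
            when = datetime.strptime(f"{day} {hhmm}", "%Y-%m-%d %H:%M")
            entries.append(Entry(when, user, host, message))
    return entries


SAMPLE = """\
== 2026-06-19 ==
* 10:38 moritzm: imported nodejs 24.17.0-1nodesource1 to thirdparty/node24 for trixie-wikimedia
* 07:31 slyngshede@dns1004: END - running authdns-update
"""

for e in parse_log(SAMPLE):
    print(e.when.isoformat(), e.user, e.host, e.message)
```

Lines that match neither pattern are skipped rather than raising an error, so a malformed entry does not stop the rest of the page from being read.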
== 2026-06-18 ==
* 23:46 Amir1: ALTER TABLE reading_list_project AUTO_INCREMENT = 882; on wikishared on x1 master ([[phab:T428002|T428002]])
* 23:34 rzl@deploy1003: Finished deploy [docker-pkg/deploy@f030aed]: (no justification provided) (duration: 00m 45s)
* 23:33 rzl@deploy1003: Started deploy [docker-pkg/deploy@f030aed]: (no justification provided)
* 23:28 rzl@deploy1003: Finished deploy [docker-pkg/deploy@f030aed]: (no justification provided) (duration: 00m 26s)
* 23:27 rzl@deploy1003: Started deploy [docker-pkg/deploy@f030aed]: (no justification provided)
* 23:03 rzl: rzl@apt1002:~$ sudo -i reprepro -C main include trixie-wikimedia /home/rzl/httpbb/trixie/httpbb_0.0.5-1+deb13u1_amd64.changes # [[phab:T427899|T427899]]
* 22:52 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304195{{!}}hCaptcha: Re-enable for mcrundo (T427612)]] (duration: 07m 25s)
* 22:47 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 22:46 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1304195{{!}}hCaptcha: Re-enable for mcrundo (T427612)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1304195{{!}}hCaptcha: Re-enable for mcrundo (T427612)]]
* 21:29 maryum: Deployed security fix for [[phab:T428833|T428833]]
* 21:14 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303493{{!}}Prevent surveys being automatically added to non-Wikipedias (T393436)]] (duration: 07m 54s)
* 21:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 21:10 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 21:09 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 21:08 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1303493{{!}}Prevent surveys being automatically added to non-Wikipedias (T393436)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:06 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1303493{{!}}Prevent surveys being automatically added to non-Wikipedias (T393436)]]
* 20:12 dani@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303895{{!}}Deploy English Wikipedia Mobile App Survey (T428876)]] (duration: 08m 20s)
* 20:08 dani@deploy1003: dani: Continuing with deployment
* 20:06 dani@deploy1003: dani: Backport for [[gerrit:1303895{{!}}Deploy English Wikipedia Mobile App Survey (T428876)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 dani@deploy1003: Started scap sync-world: Backport for [[gerrit:1303895{{!}}Deploy English Wikipedia Mobile App Survey (T428876)]]
* 19:11 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=dns7002.*
* 19:09 cdobbins@dns1004: END - running authdns-update
* 19:08 cdobbins@dns1004: START - running authdns-update
* 19:07 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=dns7002.*,service=authdns-update
* 19:05 cdobbins@dns1004: END - running authdns-update
* 19:04 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2002.codfw.wmnet with reason: Host Replacement
* 19:03 cdobbins@dns1004: START - running authdns-update
* 19:01 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dns7002.wikimedia.org
* 19:01 cdobbins@cumin2002: START - Cookbook sre.hosts.remove-downtime for dns7002.wikimedia.org
* 18:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns7002.wikimedia.org with OS bookworm
* 18:39 jhuneidi@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.7 refs [[phab:T423916|T423916]]
* 18:37 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 18:34 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 18:33 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 18:31 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 18:29 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:28 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:27 swfrench-wmf: (eqiad) kubectl delete pod coredns-54cdd9bdf-6hwb5 -n kube-system - [[phab:T429156|T429156]]
* 18:27 swfrench-wmf: (eqiad) kubectl delete pod coredns-54cdd9bdf-6n4ps -n kube-system - [[phab:T429156|T429156]]
* 18:26 jhuneidi@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304067{{!}}SpecialSpecialPages: Guard against special pages with no content-language alias (T429584)]] (duration: 08m 46s)
* 18:21 jhuneidi@deploy1003: jhuneidi, jforrester: Continuing with deployment
* 18:19 jhuneidi@deploy1003: jhuneidi, jforrester: Backport for [[gerrit:1304067{{!}}SpecialSpecialPages: Guard against special pages with no content-language alias (T429584)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:17 jhuneidi@deploy1003: Started scap sync-world: Backport for [[gerrit:1304067{{!}}SpecialSpecialPages: Guard against special pages with no content-language alias (T429584)]]
* 18:09 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 18:04 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 17:37 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host dns7002.wikimedia.org with OS bookworm
* 16:28 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304112{{!}}Add script to fix fr_archive_name drifts (T428406)]] (duration: 06m 46s)
* 16:24 zabe@deploy1003: zabe: Continuing with deployment
* 16:24 zabe@deploy1003: zabe: Backport for [[gerrit:1304112{{!}}Add script to fix fr_archive_name drifts (T428406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:22 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1304112{{!}}Add script to fix fr_archive_name drifts (T428406)]]
* 15:55 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303981{{!}}LocalFileMoveBatch: Also update fr_archive_name when moving file (T428406)]] (duration: 06m 49s)
* 15:51 zabe@deploy1003: zabe: Continuing with deployment
* 15:51 zabe@deploy1003: zabe: Backport for [[gerrit:1303981{{!}}LocalFileMoveBatch: Also update fr_archive_name when moving file (T428406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1303981{{!}}LocalFileMoveBatch: Also update fr_archive_name when moving file (T428406)]]
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:08 elukey@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 15:08 elukey@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 15:04 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304082{{!}}Check that data-parsoid is an array before accessing it as such (T429582)]] (duration: 11m 17s)
* 15:00 cscott@deploy1003: ihurbain, cscott: Continuing with deployment
* 14:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:57 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:55 cscott@deploy1003: ihurbain, cscott: Backport for [[gerrit:1304082{{!}}Check that data-parsoid is an array before accessing it as such (T429582)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:53 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1304082{{!}}Check that data-parsoid is an array before accessing it as such (T429582)]]
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:51 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:51 ayounsi@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:46 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:42 moritzm: installing zsh updates from Bookworm point release
* 14:37 brouberol@dns1004: END - running authdns-update
* 14:35 brouberol@dns1004: START - running authdns-update
* 14:27 jgreen@dns1004: END - running authdns-update
* 14:25 jgreen@dns1004: START - running authdns-update
* 14:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy2007.codfw.wmnet
* 14:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy2007.codfw.wmnet
* 14:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy2008.codfw.wmnet
* 14:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy2008.codfw.wmnet
* 14:20 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 14:20 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 14:19 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2235.codfw.wmnet
* 14:19 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2235.codfw.wmnet
* 14:14 Msz2001: Finished deploying private code change
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2235.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2008.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 14:08 moritzm: installing unbound security updates
* 14:07 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2234.codfw.wmnet
* 14:07 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2234.codfw.wmnet
* 14:00 tgr_: UTC afternoon deploys done
* 14:00 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304038{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]], [[gerrit:1304039{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]] (duration: 11m 51s)
* 13:56 tgr@deploy1003: tgr: Continuing with deployment
* 13:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2234.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2007.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:52 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy2005.codfw.wmnet
* 13:52 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy2005.codfw.wmnet
* 13:51 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2232.codfw.wmnet
* 13:51 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2232.codfw.wmnet
* 13:51 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 13:51 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 13:50 tgr@deploy1003: tgr: Backport for [[gerrit:1304038{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]], [[gerrit:1304039{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:48 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1304038{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]], [[gerrit:1304039{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]]
* 13:46 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303613{{!}}magwiki: add wordmark, metanamespace, sitename and timezone (T428279)]], [[gerrit:1304004{{!}}stream: webrequest.page_trending.dev0 (T429588)]] (duration: 08m 15s)
* 13:42 tgr@deploy1003: javiermonton, tgr, anzx: Continuing with deployment
* 13:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 13:40 tgr@deploy1003: javiermonton, tgr, anzx: Backport for [[gerrit:1303613{{!}}magwiki: add wordmark, metanamespace, sitename and timezone (T428279)]], [[gerrit:1304004{{!}}stream: webrequest.page_trending.dev0 (T429588)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:38 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1303613{{!}}magwiki: add wordmark, metanamespace, sitename and timezone (T428279)]], [[gerrit:1304004{{!}}stream: webrequest.page_trending.dev0 (T429588)]]
* 13:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2232.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2005.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:33 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 13:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303004{{!}}REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771)]] (duration: 06m 56s)
* 13:26 ladsgroup@deploy1003: ladsgroup, bpirkle: Continuing with deployment
* 13:25 ladsgroup@deploy1003: ladsgroup, bpirkle: Backport for [[gerrit:1303004{{!}}REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303004{{!}}REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771)]]
* 13:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:21 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302923{{!}}EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380)]] (duration: 10m 55s)
* 13:14 ladsgroup@deploy1003: ladsgroup, lerickson: Continuing with deployment
* 13:10 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:10 ladsgroup@deploy1003: ladsgroup, lerickson: Backport for [[gerrit:1302923{{!}}EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:08 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:08 fabfur: deploying new haproxykafka on A:cp to parse for x_provenance ([[phab:T427068|T427068]])
* 13:08 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1302923{{!}}EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of testvm2005.codfw.wmnet to plain
* 13:05 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to plain
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
* 13:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis magwiki in section s5
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 12:56 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 12:39 fabfur: upgrade haproxykafka on cp1111 to test for new x-provenance field ([[phab:T427068|T427068]])
* 12:36 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 12:35 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis magwiki in section s5
* 12:34 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 12:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis magwiki in section s5
* 12:31 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304017{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]], [[gerrit:1304016{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]] (duration: 17m 49s)
* 12:29 cwilliams@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis magwiki in section s5
* 12:27 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:16 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1304017{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]], [[gerrit:1304016{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:14 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1304017{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]], [[gerrit:1304016{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]]
* 11:10 atsukoito: atsuko updated charlie to 0.0.19 https://w.wiki/RPKN
* 10:37 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.disable-merges (exit_code=99)
* 10:37 jmm@cumin2002: START - Cookbook sre.puppet.disable-merges
* 10:24 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303986{{!}}hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394)]] (duration: 12m 13s)
* 10:19 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 10:14 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1303986{{!}}hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1303986{{!}}hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394)]]
* 10:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 10:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 10:01 fabfur@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]"
* 10:01 fabfur@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]
* 10:00 fabfur@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]
* 10:00 fabfur@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]"
* 09:59 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303983{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]], [[gerrit:1303982{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]] (duration: 08m 10s)
* 09:55 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:54 kharlan@deploy1003: kharlan: Backport for [[gerrit:1303983{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]], [[gerrit:1303982{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:51 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1303983{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]], [[gerrit:1303982{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]]
* 09:33 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 09:33 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 09:33 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 09:32 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 09:11 moritzm: installing apache2 security updates
* 08:55 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 08:53 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 08:53 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 08:51 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 08:51 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 08:51 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 08:35 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:34 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:22 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 08:21 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 08:20 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 08:19 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 08:05 moritzm: regenerate pbuilder environments on build2001 to use deb.debian.org [[phab:T416707|T416707]]
* 08:02 moritzm: uploaded wmf-laptop 1.0.6 to component/wmf-laptop on apt.wikimedia.org
* 08:01 moritzm: regenerate pbuilder environments on build2002 to use deb.debian.org [[phab:T416707|T416707]]
* 06:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 06:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2040: Migration of es2040.codfw.wmnet completed
* 06:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2040: Migration of es2040.codfw.wmnet completed
* 05:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2040.codfw.wmnet with OS trixie
* 05:41 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
* 05:41 marostegui@cumin1003: Removing db1224 from zarcillo [[phab:T429561|T429561]]
* 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1224.eqiad.wmnet
* 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1224.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:40 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1224.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:36 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 05:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2040.codfw.wmnet with reason: host reimage
* 05:31 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2040.codfw.wmnet with reason: host reimage
* 05:31 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db1224.eqiad.wmnet
* 05:30 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 05:27 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db1224 from dbctl [[phab:T429561|T429561]]', diff saved to https://phabricator.wikimedia.org/P94269 and previous config saved to /var/cache/conftool/dbconfig/20260618-052737-marostegui.json
* 05:14 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2040.codfw.wmnet with OS trixie
* 05:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2040: Upgrading es2040.codfw.wmnet
* 05:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2040: Upgrading es2040.codfw.wmnet
* 05:12 marostegui@cumin1003: dbmaint on es7@codfw [[phab:T429463|T429463]]
* 05:12 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 45s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303600{{!}}Update interwiki map (T428266)]] (duration: 06m 55s)
* 01:15 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 01:14 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1303600{{!}}Update interwiki map (T428266)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:12 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303600{{!}}Update interwiki map (T428266)]]
* 00:48 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303596{{!}}Activate magwiki (T428266)]] (duration: 07m 25s)
* 00:43 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:42 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1303596{{!}}Activate magwiki (T428266)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:40 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303596{{!}}Activate magwiki (T428266)]]
* 00:33 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303594{{!}}Init magwiki (T428266)]] (duration: 07m 14s)
* 00:29 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:28 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1303594{{!}}Init magwiki (T428266)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303594{{!}}Init magwiki (T428266)]]
== 2026-06-17 ==
* 23:26 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303504{{!}}Enable beta mobile MMV on Wikipedias (T426775)]] (duration: 06m 46s)
* 23:22 egardner@deploy1003: egardner: Continuing with deployment
* 23:21 egardner@deploy1003: egardner: Backport for [[gerrit:1303504{{!}}Enable beta mobile MMV on Wikipedias (T426775)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:19 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1303504{{!}}Enable beta mobile MMV on Wikipedias (T426775)]]
* 23:17 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303552{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303553{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303554{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] (duration: 06m 55s)
* 23:14 mutante: gerrit2002 - unlink /srv/gerrit/site_path/review_site/logs/logs ([[phab:T425667|T425667]])
* 23:12 egardner@deploy1003: egardner: Continuing with deployment
* 23:12 egardner@deploy1003: egardner: Backport for [[gerrit:1303552{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303553{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303554{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:10 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1303552{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303553{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303554{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]]
* 23:04 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303571{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303572{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303573{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] (duration: 12m 31s)
* 22:57 egardner@deploy1003: egardner: Continuing with deployment
* 22:56 egardner@deploy1003: egardner: Backport for [[gerrit:1303571{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303572{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303573{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:52 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1303571{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303572{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303573{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]]
* 22:45 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303517{{!}}Donor Delight Badge: Add accessible label and hide popover from AT (T427313)]] (duration: 31m 01s)
* 22:32 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:31 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1303517{{!}}Donor Delight Badge: Add accessible label and hide popover from AT (T427313)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:14 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1303517{{!}}Donor Delight Badge: Add accessible label and hide popover from AT (T427313)]]
* 21:52 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:52 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:29 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:29 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:29 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:28 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:27 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:27 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:23 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:22 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:22 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:21 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:20 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:20 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:15 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:12 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:12 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:09 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:06 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:05 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:02 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 21:02 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 20:45 cdobbins@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dns7002.wikimedia.org with reason: bird.service keeps failing
* 20:41 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-ats (exit_code=0) rolling restart_daemons on A:cp
* 20:41 cdobbins@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns7002.wikimedia.org with OS trixie
* 20:36 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303012{{!}}Enable ULS v2 on group1 wikis]] (duration: 08m 26s)
* 20:31 sbisson@deploy1003: sbisson, abi: Continuing with deployment
* 20:29 sbisson@deploy1003: sbisson, abi: Backport for [[gerrit:1303012{{!}}Enable ULS v2 on group1 wikis]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:27 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1303012{{!}}Enable ULS v2 on group1 wikis]]
* 20:17 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303365{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]], [[gerrit:1303364{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]] (duration: 06m 55s)
* 20:13 sgimeno@deploy1003: sgimeno: Continuing with deployment
* 20:12 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1303365{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]], [[gerrit:1303364{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:11 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1303365{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]], [[gerrit:1303364{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]]
* 19:44 jgreen@dns1005: END - running authdns-update
* 19:42 jgreen@dns1005: START - running authdns-update
* 19:31 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5005*<nowiki>}</nowiki> and A:liberica ([[phab:T428229|T428229]])
* 19:30 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5005*<nowiki>}</nowiki> and A:liberica ([[phab:T428229|T428229]])
* 19:16 jhuneidi@deploy1003: Finished scap sync-world: wmf.7 to group 1 (Take 2) (duration: 07m 01s)
* 19:16 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-purged (exit_code=0) rolling restart_daemons on A:cp and not P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:10 jhuneidi@deploy1003: Started scap sync-world: wmf.7 to group 1 (Take 2)
* 19:08 jhuneidi@deploy1003: Finished scap sync-world: Attempt to roll wmf.7 to group 1 (duration: 07m 24s)
* 19:01 jhuneidi@deploy1003: Started scap sync-world: Attempt to roll wmf.7 to group 1
* 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcontrol1008-dev.eqiad.wmnet
* 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol1008-dev.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 18:59 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol1008-dev.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 18:52 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 18:46 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudcontrol1008-dev.eqiad.wmnet
* 18:24 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6011.*
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6011.drmrs.wmnet
* 18:24 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp6011.drmrs.wmnet
* 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp6011.drmrs.wmnet with reason: ats restart, continuing from failed cookbook run
* 18:17 brett: commit new lvs5005 IP address to cr2-eqsin.wikimedia.org,cr3-eqsin.wikimedia.org
* 18:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp6011.drmrs.wmnet
* 18:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp6011.drmrs.wmnet
* 18:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6011.*
* 17:41 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs5005.eqsin.wmnet with OS bookworm
* 17:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs5005.eqsin.wmnet with reason: host reimage
* 17:16 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs5005.eqsin.wmnet with reason: host reimage
* 17:06 mutante: contint1003 - even with [[gerrit:1301416]] jenkins was STILL restarted :/ - stopping it manually and puppet - debugging - [[phab:T418521|T418521]]
* 17:03 mutante: contint1003 - re-enabling puppet - checking it does NOT start jenkins - also see [[gerrit:1297236]] and [[gerrit:1301416]] - [[phab:T418521|T418521]]
* 16:51 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:51 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:49 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-ats rolling restart_daemons on A:cp
* 16:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host lvs5005
* 16:48 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs5005
* 16:48 dcausse@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:47 dcausse@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:47 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs5005
* 16:47 brett@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:47 brett@cumin2002: START - Cookbook sre.dns.wipe-cache lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:45 brett@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:45 brett@cumin2002: START - Cookbook sre.dns.wipe-cache lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:45 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:45 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host lvs5005 - brett@cumin2002"
* 16:45 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host lvs5005 - brett@cumin2002"
* 16:45 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:45 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:39 brett@cumin2002: START - Cookbook sre.dns.netbox
* 16:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1078.eqiad.wmnet with OS trixie
* 16:16 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:16 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host lvs5005
* 16:16 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:15 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs5005.eqsin.wmnet with OS bookworm
* 16:15 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1007.eqiad.wmnet with OS trixie
* 16:15 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:11 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:02 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 16:02 brett@cumin2002: START - Cookbook sre.loadbalancer.admin depooling P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 16:00 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-purged rolling restart_daemons on A:cp and not P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 15:58 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1078.eqiad.wmnet with reason: host reimage
* 15:54 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1007.eqiad.wmnet with reason: host reimage
* 15:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Migration of es2048.codfw.wmnet completed
* 15:53 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1078.eqiad.wmnet with reason: host reimage
* 15:47 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1007.eqiad.wmnet with reason: host reimage
* 15:46 moritzm: installing python-ldap security updates
* 15:42 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1078.eqiad.wmnet with OS trixie
* 15:30 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:27 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 15:26 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
* 15:08 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Migration of es2048.codfw.wmnet completed
* 15:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:03 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1004.eqiad.wmnet with OS trixie
* 15:02 aokoth@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab (duration: 01m 24s)
* 15:00 aokoth@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab
* 14:59 cdobbins@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 14:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2048.codfw.wmnet with OS trixie
* 14:56 cdobbins@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 14:44 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1004.eqiad.wmnet with reason: host reimage
* 14:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2048.codfw.wmnet with reason: host reimage
* 14:35 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1004.eqiad.wmnet with reason: host reimage
* 14:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2048.codfw.wmnet with reason: host reimage
* 14:28 cdobbins@cumin1003: START - Cookbook sre.hosts.reimage for host dns7002.wikimedia.org with OS trixie
* 14:26 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303436{{!}}Add Wikidata configuration for WikiProject links (T422935 T422936)]] (duration: 07m 49s)
* 14:22 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
* 14:21 cjd91: depooling dns7002 to attempt reimage to trixie
* 14:20 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1303436{{!}}Add Wikidata configuration for WikiProject links (T422935 T422936)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:19 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1004.eqiad.wmnet with OS trixie
* 14:19 cdobbins@cumin1003: conftool action : set/pooled=no; selector: name=dns7002.*
* 14:18 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1303436{{!}}Add Wikidata configuration for WikiProject links (T422935 T422936)]]
* 14:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2048.codfw.wmnet with OS trixie
* 14:17 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 14:17 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 14:17 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 14:16 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 14:16 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2048: Upgrading es2048.codfw.wmnet
* 14:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2048: Upgrading es2048.codfw.wmnet
* 14:13 elukey: add basic Kafka ACLs for anonymous to logging-eqiad - [[phab:T425528|T425528]]
* 14:13 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:13 Lucas_WMDE: UTC afternoon backport+config window done
* {{safesubst:SAL entry|1=14:13 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302739{{!}}ULS rewrite: Lock body scroll when open on mobile]], [[gerrit:1302743{{!}}ULS rewrite: Fix settings dialog width and field sizing (T416512)]], [[gerrit:1303010{{!}}ULS rewrite: Show variants even when no languages are available (T426532)]], [[gerrit:1303009{{!}}ULS rewrite: Capture trigger element before async module load (T429145)]], [[gerr}}
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs-test1001.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1003.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1002.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1001.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs-test1001.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1003.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1002.eqiad.wmnet
* 14:11 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1001.eqiad.wmnet
* 14:11 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs*.eqiad.wmnet
* 14:08 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, abi: Continuing with deployment
* 14:06 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:01 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 14:00 jmm@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2003.codfw.wmnet with OS bookworm
* 13:58 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2004.codfw.wmnet with OS bookworm
* 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* {{safesubst:SAL entry|1=13:55 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, abi: Backport for [[gerrit:1302739{{!}}ULS rewrite: Lock body scroll when open on mobile]], [[gerrit:1302743{{!}}ULS rewrite: Fix settings dialog width and field sizing (T416512)]], [[gerrit:1303010{{!}}ULS rewrite: Show variants even when no languages are available (T426532)]], [[gerrit:1303009{{!}}ULS rewrite: Capture trigger element before async module load (T429145)]], [[ge}}
* {{safesubst:SAL entry|1=13:53 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1302739{{!}}ULS rewrite: Lock body scroll when open on mobile]], [[gerrit:1302743{{!}}ULS rewrite: Fix settings dialog width and field sizing (T416512)]], [[gerrit:1303010{{!}}ULS rewrite: Show variants even when no languages are available (T426532)]], [[gerrit:1303009{{!}}ULS rewrite: Capture trigger element before async module load (T429145)]], [[gerri}}
* 13:52 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 13:51 jmm@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 13:51 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.bmc-user-mgmt (exit_code=0) for host sretest[2001,2003-2004,2006,2009-2010].codfw.wmnet,sretest1005.eqiad.wmnet
* 13:50 elukey@cumin1003: START - Cookbook sre.hosts.bmc-user-mgmt for host sretest[2001,2003-2004,2006,2009-2010].codfw.wmnet,sretest1005.eqiad.wmnet
* 13:47 papaul: mgmt interface change on mr-codfw
* 13:46 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-codfw with reason: mgmt interface change
* 13:45 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-codfw with reason: switch refresh
* 13:42 jmm@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:42 jmm@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 13:33 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298293{{!}}Add Wikidata configuration for WikiProject links (T422935)]], [[gerrit:1299943{{!}}Add instance-of WikiProject links for paintings and elections (T422936)]] (duration: 08m 14s)
* 13:32 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1006.eqiad.wmnet with OS trixie
* 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudcephosd1016
* 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudcephosd1016
* 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1061
* 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1061
* 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1069
* 13:31 lucaswerkmeister-wmde@deploy1003: sadiyamohammed13, lucaswerkmeister-wmde: Rolling back deployment
* 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1069
* 13:30 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1068
* 13:30 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1068
* 13:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1005.eqiad.wmnet with OS trixie
* 13:27 lucaswerkmeister-wmde@deploy1003: sadiyamohammed13, lucaswerkmeister-wmde: Backport for [[gerrit:1298293{{!}}Add Wikidata configuration for WikiProject links (T422935)]], [[gerrit:1299943{{!}}Add instance-of WikiProject links for paintings and elections (T422936)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298293{{!}}Add Wikidata configuration for WikiProject links (T422935)]], [[gerrit:1299943{{!}}Add instance-of WikiProject links for paintings and elections (T422936)]]
* 13:24 jmm@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 13:23 jmm@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 13:14 dani@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302998{{!}}Add English Wikipedia Mobile App Survey (T428876)]] (duration: 07m 53s)
* 13:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1006.eqiad.wmnet with reason: host reimage
* 13:11 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-codfw
* 13:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1005.eqiad.wmnet with reason: host reimage
* 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-eqiad
* 13:10 dani@deploy1003: dani: Continuing with deployment
* 13:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1045: repool after upgrade
* 13:08 dani@deploy1003: dani: Backport for [[gerrit:1302998{{!}}Add English Wikipedia Mobile App Survey (T428876)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1006.eqiad.wmnet with reason: host reimage
* 13:06 dani@deploy1003: Started scap sync-world: Backport for [[gerrit:1302998{{!}}Add English Wikipedia Mobile App Survey (T428876)]]
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1005.eqiad.wmnet with reason: host reimage
* 13:00 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:53 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:52 blake@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host mc-gp1006
* 12:52 blake@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host mc-gp1006
* 12:51 blake@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc-gp1006
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mc-gp1006.eqiad.wmnet 182.48.64.10.in-addr.arpa 2.8.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:51 blake@cumin1003: START - Cookbook sre.dns.wipe-cache mc-gp1006.eqiad.wmnet 182.48.64.10.in-addr.arpa 2.8.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host mc-gp1005
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host mc-gp1005
* 12:49 blake@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc-gp1005
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mc-gp1005.eqiad.wmnet 126.32.64.10.in-addr.arpa 6.2.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:49 blake@cumin1003: START - Cookbook sre.dns.wipe-cache mc-gp1005.eqiad.wmnet 126.32.64.10.in-addr.arpa 6.2.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host mc-gp1005 - blake@cumin1003"
* 12:49 blake@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host mc-gp1005 - blake@cumin1003"
* 12:48 blake@cumin1003: START - Cookbook sre.dns.netbox
* 12:45 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-codfw
* 12:45 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-eqiad
* 12:43 blake@cumin1003: START - Cookbook sre.dns.netbox
* 12:41 blake@cumin1003: START - Cookbook sre.hosts.move-vlan for host mc-gp1006
* 12:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 12:41 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:41 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:41 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:41 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1006.eqiad.wmnet with OS trixie
* 12:41 blake@cumin1003: START - Cookbook sre.hosts.move-vlan for host mc-gp1005
* 12:40 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1005.eqiad.wmnet with OS trixie
* 12:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2004.codfw.wmnet with reason: host reimage
* 12:37 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1163: Migration of db1163.eqiad.wmnet completed
* 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2003.codfw.wmnet with reason: host reimage
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:32 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 12:32 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 12:32 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 12:32 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 12:29 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2004.codfw.wmnet with reason: host reimage
* 12:28 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2003.codfw.wmnet with reason: host reimage
* 12:24 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1045: repool after upgrade
* 12:23 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:22 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1045.eqiad.wmnet with OS trixie
* 12:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2001.codfw.wmnet with reason: host reimage
* 12:19 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:16 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2004.codfw.wmnet with OS bookworm
* 12:16 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2003.codfw.wmnet with OS bookworm
* 12:15 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2001.codfw.wmnet with reason: host reimage
* 12:13 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 12:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:07 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:07 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:07 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 12:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:05 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1045.eqiad.wmnet with reason: host reimage
* 12:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2044: repool after maintenance es2044
* 12:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 12:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 12:01 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1045.eqiad.wmnet with reason: host reimage
* 11:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 11:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 11:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 11:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 11:51 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 11:51 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1163: Migration of db1163.eqiad.wmnet completed
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1045.eqiad.wmnet with OS trixie
* 11:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1045: Upgrading es1045.eqiad.wmnet
* 11:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1045: Upgrading es1045.eqiad.wmnet
* 11:42 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1163.eqiad.wmnet with OS trixie
* 11:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 11:35 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 11:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1191.eqiad.wmnet with reason: upgrading
* 11:23 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 11:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1163.eqiad.wmnet with reason: host reimage
* 11:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1172.eqiad.wmnet with reason: upgrading
* 11:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet
* 11:21 marostegui@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1:00:00 on db1171.eqiad.wmnet with reason: upgrading
* 11:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: upgrading
* 11:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1163.eqiad.wmnet with reason: host reimage
* 11:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:17 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2044: repool after maintenance es2044
* 11:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2044.codfw.wmnet with OS trixie
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1003.eqiad.wmnet with OS bookworm
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:11 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:10 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 11:09 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:08 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1038: Migration of es1038.eqiad.wmnet completed
* 11:04 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1163.eqiad.wmnet with OS trixie
* 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:01 moritzm: The Debian mirror on mirrors.wikimedia.org has been disabled [[phab:T416707|T416707]]
* 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1163: Upgrading db1163.eqiad.wmnet
* 10:59 btullis@cumin1003: START - Cookbook sre.hosts.dhcp for host dse-k8s-wdqs2001.codfw.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1163: Upgrading db1163.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2044.codfw.wmnet with reason: host reimage
* 10:53 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2044.codfw.wmnet with reason: host reimage
* 10:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1003.eqiad.wmnet with reason: host reimage
* 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2203: Migration of db2203.codfw.wmnet completed
* 10:43 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1003.eqiad.wmnet with reason: host reimage
* 10:38 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2044.codfw.wmnet with OS trixie
* 10:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2044: Upgrading es2044.codfw.wmnet
* 10:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2044: Upgrading es2044.codfw.wmnet
* 10:35 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:35 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:35 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:34 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:34 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:34 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:31 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1003.eqiad.wmnet with OS bookworm
* 10:29 moritzm: installing git-lfs security updates
* 10:28 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 10:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1038: Migration of es1038.eqiad.wmnet completed
* 10:22 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 10:21 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 10:17 claime: cumin -x 'A:swift-fe' "enable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1038.eqiad.wmnet with OS trixie
* 10:12 claime: cumin -x 'A:swift-fe' "disable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
* 10:10 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 10:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:04 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1002.eqiad.wmnet with reason: host reimage
* 10:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2203: Migration of db2203.codfw.wmnet completed
* 10:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1002.eqiad.wmnet with reason: host reimage
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1038.eqiad.wmnet with reason: host reimage
* 09:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1038.eqiad.wmnet with reason: host reimage
* 09:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2203.codfw.wmnet with OS trixie
* 09:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2045: repool after maintenance es2045
* 09:48 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 09:47 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303356{{!}}hCaptcha: Remove config for VE and DT enable (T428883)]], [[gerrit:1303354{{!}}Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883)]] (duration: 15m 32s)
* 09:41 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 09:39 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 09:38 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1038.eqiad.wmnet with OS trixie
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1038: Upgrading es1038.eqiad.wmnet
* 09:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1303356{{!}}hCaptcha: Remove config for VE and DT enable (T428883)]], [[gerrit:1303354{{!}}Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1038: Upgrading es1038.eqiad.wmnet
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:37 marostegui@dns1004: END - running authdns-update
* 09:36 marostegui@cumin1003: dbctl commit (dc=all): 'Set es6 eqiad back to read-write - [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94226 and previous config saved to /var/cache/conftool/dbconfig/20260617-093559-marostegui.json
* 09:35 marostegui@dns1004: START - running authdns-update
* 09:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es1038 [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94225 and previous config saved to /var/cache/conftool/dbconfig/20260617-093513-marostegui.json
* 09:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2203.codfw.wmnet with reason: host reimage
* 09:33 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1037 to es6 primary [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94224 and previous config saved to /var/cache/conftool/dbconfig/20260617-093310-marostegui.json
* 09:32 marostegui: Starting es6 eqiad failover from es1038 to es1037 - [[phab:T429436|T429436]]
* 09:32 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1303356{{!}}hCaptcha: Remove config for VE and DT enable (T428883)]], [[gerrit:1303354{{!}}Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883)]]
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es1037 with weight 0 [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94223 and previous config saved to /var/cache/conftool/dbconfig/20260617-092940-marostegui.json
* 09:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 8 hosts with reason: Primary switchover es6 [[phab:T429436|T429436]]
* 09:29 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es6 eqiad as read-only for maintenance - [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94222 and previous config saved to /var/cache/conftool/dbconfig/20260617-092913-marostegui.json
* 09:27 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2203.codfw.wmnet with reason: host reimage
* 09:26 jynus: testing x1 backups @ cumin2003 [[phab:T427897|T427897]]
* 09:11 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2203.codfw.wmnet with OS trixie
* 09:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2203: Upgrading db2203.codfw.wmnet
* 09:09 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2203: Upgrading db2203.codfw.wmnet
* 09:09 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:07 elukey: add basic Kafka ACLs for anonymous to logging-codfw - [[phab:T425528|T425528]] (I'll add rollback steps in the task if needed)
* 09:06 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2045: repool after maintenance es2045
* 09:06 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es2044: Upgrading es2044.codfw.wmnet
* 09:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2044: Upgrading es2044.codfw.wmnet
* 09:04 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:02 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2046 to es5 codfw primary [[phab:T428572|T428572]]', diff saved to https://phabricator.wikimedia.org/P94219 and previous config saved to /var/cache/conftool/dbconfig/20260617-090221-marostegui.json
* 09:02 joal@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:01 joal@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:00 joal@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 08:59 joal@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:57 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2203 [[phab:T429190|T429190]]', diff saved to https://phabricator.wikimedia.org/P94218 and previous config saved to /var/cache/conftool/dbconfig/20260617-085615-cwilliams.json
* 08:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2009.codfw.wmnet with OS trixie
* 08:55 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:55 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:53 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2212 to s1 primary [[phab:T429190|T429190]]', diff saved to https://phabricator.wikimedia.org/P94217 and previous config saved to /var/cache/conftool/dbconfig/20260617-085310-cwilliams.json
* 08:51 cezmunsta: Starting s1 codfw failover from db2203 to db2212 - [[phab:T429190|T429190]]
* 08:51 marostegui@dns1004: END - running authdns-update
* 08:49 marostegui@dns1004: START - running authdns-update
* 08:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:46 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2212 with weight 0 [[phab:T429190|T429190]]', diff saved to https://phabricator.wikimedia.org/P94215 and previous config saved to /var/cache/conftool/dbconfig/20260617-084642-cwilliams.json
* 08:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:46 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T429190|T429190]]
* 08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1044: repool after upgrade
* 08:38 jelto: "Imported helm3 3.19.5-1 to bullseye-wikimedia, bookworm-wikimedia and trixie-wikimedia - [[phab:T427403|T427403]]"
* 08:38 moritzm: installing apache2 security updates
* 08:36 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:35 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2009.codfw.wmnet with reason: host reimage
* 08:31 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2009.codfw.wmnet with reason: host reimage
* 08:25 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303296{{!}}Squashed diff to master]], [[gerrit:1303295{{!}}Squashed diff to master]] (duration: 35m 34s)
* 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2008.codfw.wmnet with OS trixie
* 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:17 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:14 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2009.codfw.wmnet with OS trixie
* 08:12 mlitn@deploy1003: mlitn: Continuing with deployment
* 08:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 08:09 mlitn@deploy1003: mlitn: Backport for [[gerrit:1303296{{!}}Squashed diff to master]], [[gerrit:1303295{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:07 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 08:04 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2008.codfw.wmnet with reason: host reimage
* 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2007.codfw.wmnet with OS trixie
* 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:03 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1001.eqiad.wmnet with OS bookworm
* 08:01 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 08:00 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1044: repool after upgrade
* 08:00 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2008.codfw.wmnet with reason: host reimage
* 07:59 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1044.eqiad.wmnet with OS trixie
* 07:53 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:49 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1303296{{!}}Squashed diff to master]], [[gerrit:1303295{{!}}Squashed diff to master]]
* 07:44 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2007.codfw.wmnet with reason: host reimage
* 07:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 07:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 07:42 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2008.codfw.wmnet with OS trixie
* 07:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1044.eqiad.wmnet with reason: host reimage
* 07:39 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2007.codfw.wmnet with reason: host reimage
* 07:32 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1044.eqiad.wmnet with reason: host reimage
* 07:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:23 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:23 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2007.codfw.wmnet with OS trixie
* 07:22 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:22 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003"
* 07:22 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003
* 07:21 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:21 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003
* 07:21 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003"
* 07:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1044.eqiad.wmnet with OS trixie
* 07:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1044: Upgrading es1044.eqiad.wmnet
* 07:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1044: Upgrading es1044.eqiad.wmnet
* 07:15 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1037: Migration of es1037.eqiad.wmnet completed
* 06:53 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 06:53 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 06:52 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 06:52 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 06:46 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 06:46 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 06:46 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 06:46 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 06:28 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1037: Migration of es1037.eqiad.wmnet completed
* 06:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1037.eqiad.wmnet with OS trixie
* 05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1037.eqiad.wmnet with reason: host reimage
* 05:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1037.eqiad.wmnet with reason: host reimage
* 05:38 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1037.eqiad.wmnet with OS trixie
* 05:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1037: Upgrading es1037.eqiad.wmnet
* 05:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1037: Upgrading es1037.eqiad.wmnet
* 05:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 00:01 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
* 00:01 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
== 2026-06-16 ==
* 23:44 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2006.codfw.wmnet with reason: host reimage
* 23:38 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2006.codfw.wmnet with reason: host reimage
* 23:03 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 23:02 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp - OpenSSL update ()
* 23:01 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
* 22:57 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
* 22:57 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
* 22:52 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
* 22:50 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
* 22:50 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:49 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:37 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
* 22:30 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 22:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:08 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:07 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302953{{!}}Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355)]], [[gerrit:1302952{{!}}Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355)]] (duration: 08m 11s)
* 22:02 kemayo@deploy1003: kemayo: Continuing with deployment
* 22:01 kemayo@deploy1003: kemayo: Backport for [[gerrit:1302953{{!}}Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355)]], [[gerrit:1302952{{!}}Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:59 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1302953{{!}}Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355)]], [[gerrit:1302952{{!}}Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355)]]
* 21:52 ryankemper@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 21:50 ryankemper@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 21:49 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 21:48 ryankemper@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 21:48 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:46 ryankemper@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:45 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:38 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:34 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302934{{!}}Update definition of html heading to match Parsoid/core (T417530 T417531 T428677)]] (duration: 18m 41s)
* 21:32 robh@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:31 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:30 robh@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:29 cscott@deploy1003: arlolra, cscott: Continuing with deployment
* 21:26 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 21:25 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 21:24 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 21:24 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 21:23 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 21:21 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 21:20 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 21:20 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 21:17 cscott@deploy1003: arlolra, cscott: Backport for [[gerrit:1302934{{!}}Update definition of html heading to match Parsoid/core (T417530 T417531 T428677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1302934{{!}}Update definition of html heading to match Parsoid/core (T417530 T417531 T428677)]]
* 21:10 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:08 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 20:54 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2043.*
* 20:51 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302890{{!}}Guard round function with a supports query (T424596)]], [[gerrit:1302935{{!}}Add wprov parameter to home link (T429268)]] (duration: 09m 28s)
* 20:47 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:43 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1302890{{!}}Guard round function with a supports query (T424596)]], [[gerrit:1302935{{!}}Add wprov parameter to home link (T429268)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:41 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1302890{{!}}Guard round function with a supports query (T424596)]], [[gerrit:1302935{{!}}Add wprov parameter to home link (T429268)]]
* 20:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns5004.*
* 20:33 brett@dns1004: END - running authdns-update
* 20:31 brett@dns1004: START - running authdns-update
* 20:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns5004.wikimedia.org with OS bookworm
* 20:30 brett@dns5004: FAIL - running authdns-update
* 20:29 brett@dns5004: START - running authdns-update
* 20:28 brett@dns5004: FAIL - running authdns-update
* 20:27 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302320{{!}}EditChecks: Namespace tracking object for seen/shown/used checks]] (duration: 09m 50s)
* 20:26 brett@dns5004: START - running authdns-update
* 20:26 brett@dns5004: START - running authdns-update
* 20:25 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns5004.*,service=authdns-update
* 20:23 kemayo@deploy1003: kemayo: Continuing with deployment
* 20:19 kemayo@deploy1003: kemayo: Backport for [[gerrit:1302320{{!}}EditChecks: Namespace tracking object for seen/shown/used checks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:18 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 20:17 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1302320{{!}}EditChecks: Namespace tracking object for seen/shown/used checks]]
* 20:09 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 20:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1001.eqiad.wmnet with reason: host reimage
* 19:56 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1001.eqiad.wmnet with reason: host reimage
* 19:55 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 19:55 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:54 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:47 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:46 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 19:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1001.eqiad.wmnet with OS bookworm
* 19:39 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:35 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp - OpenSSL update ()
* 19:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:31 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 19:30 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp - OpenSSL update ()
* 19:27 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:18 topranks: restarting grpc server on eqiad SR-Linux switches to recover from problem of no free threads [[phab:T429242|T429242]]
* 19:08 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:08 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:00 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302274{{!}}Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188)]] (duration: 11m 18s)
* 18:58 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:56 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:56 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:55 krinkle@deploy1003: krinkle: Continuing with deployment
* 18:52 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:51 krinkle@deploy1003: krinkle: Backport for [[gerrit:1302274{{!}}Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:48 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1302274{{!}}Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188)]]
* 18:45 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns5004.wikimedia.org with reason: host reimage
* 18:41 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
* 18:41 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/data-gateway: apply
* 18:41 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns5004.wikimedia.org with reason: host reimage
* 18:40 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
* 18:39 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
* 18:39 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 18:39 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 18:35 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 18:34 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 18:33 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 18:30 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 18:23 jhuneidi@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.7 refs [[phab:T423916|T423916]]
* 18:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:12 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host dns5004
* 18:12 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dns5004
* 18:08 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dns5004
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dns5004.wikimedia.org 8.166.102.103.in-addr.arpa 8.0.0.0.6.6.1.0.2.0.1.0.3.0.1.0.1.0.0.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 18:08 brett@cumin2002: START - Cookbook sre.dns.wipe-cache dns5004.wikimedia.org 8.166.102.103.in-addr.arpa 8.0.0.0.6.6.1.0.2.0.1.0.3.0.1.0.1.0.0.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host dns5004 - brett@cumin2002"
* 18:08 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host dns5004 - brett@cumin2002"
* 18:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:00 brett@cumin2002: START - Cookbook sre.dns.netbox
* 18:00 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:59 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:53 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=dns5004.*
* 17:47 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:47 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change mgmt name for frproto1001 - cmooney@cumin1003"
* 17:46 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host dns5004
* 17:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host dns5004.wikimedia.org with OS bookworm
* 17:44 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change mgmt name for frproto1001 - cmooney@cumin1003"
* 17:43 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host conf2007.codfw.wmnet with OS trixie
* 17:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302912{{!}}Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it"]], [[gerrit:1302909{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]], [[gerrit:1302908{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]] (duration: 32m 19s)
* 17:38 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 17:30 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:29 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302912{{!}}Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it"]], [[gerrit:1302909{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]], [[gerrit:1302908{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:27 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2007.codfw.wmnet with OS trixie
* 17:25 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1007.eqiad.wmnet with OS trixie
* 17:20 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
* 17:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302912{{!}}Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it"]], [[gerrit:1302909{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]], [[gerrit:1302908{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]]
* 16:35 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:09 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab1004 - [[phab:T429350|T429350]] (duration: 00m 45s)
* 16:08 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab1004 - [[phab:T429350|T429350]]
* 16:08 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy
* 16:08 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab2002 - [[phab:T429350|T429350]] (duration: 00m 47s)
* 16:07 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab2002 - [[phab:T429350|T429350]]
* 16:06 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy
* 16:04 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 15:42 urbanecm@deploy1003: mwscript-k8s job started: GrowthExperiments:migrateMentorStatusAway --wiki=abwiki --dry-run # [[phab:T409170|T409170]]
* 15:39 moritzm: installing Tomcat security updates
* 15:38 urbanecm: Remove `migrateMentorStatusAwayToCommunityConfiguration` from `updatelog` on all wikis in `growthexperiments.dblist` ([[phab:T409170|T409170]])
* 15:38 dancy@deploy1003: Installation of scap version "4.269.0" completed for 2 hosts
* 15:36 dancy@deploy1003: Installing scap version "4.269.0" for 2 host(s)
* 15:33 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: test deploy phab2003 - [[phab:T427286|T427286]] (duration: 00m 49s)
* 15:33 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: test deploy phab2003 - [[phab:T427286|T427286]]
* 15:16 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 15:16 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 15:07 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-welcome # [[phab:T429352|T429352]]
* 15:06 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-discovery # [[phab:T429352|T429352]]
* 15:03 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-mentorship # [[phab:T429352|T429352]]
* 15:01 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302804{{!}}Hotfix for T428620 (T428620)]] (duration: 10m 00s)
* 14:57 awight@deploy1003: seanleong-wmde, awight: Continuing with deployment
* 14:55 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-help-panel # [[phab:T429352|T429352]]
* 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for frproto1001 (formerly payments1008) - cmooney@cumin1003"
* 14:54 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for frproto1001 (formerly payments1008) - cmooney@cumin1003"
* 14:53 awight@deploy1003: seanleong-wmde, awight: Backport for [[gerrit:1302804{{!}}Hotfix for T428620 (T428620)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:51 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1302804{{!}}Hotfix for T428620 (T428620)]]
* 14:48 aokoth@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab (duration: 02m 09s)
* 14:46 aokoth@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab
* 14:28 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 14:28 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 14:07 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302792{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187)]], [[gerrit:1302793{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T429187)]] (duration: 11m 29s)
* 14:07 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:03 dcausse@deploy1003: jgiannelos, dcausse: Continuing with deployment
* 14:02 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 14:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:58 dcausse@deploy1003: jgiannelos, dcausse: Backport for [[gerrit:1302792{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187)]], [[gerrit:1302793{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T429187)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:56 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1302792{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187)]], [[gerrit:1302793{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T429187)]]
* 13:54 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:52 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:52 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:52 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:51 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:48 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302850{{!}}Revert "translate: remove CirrusSearch endpoints"]] (duration: 04m 10s)
* 13:47 atsuko@deploy1003: atsuko: Rolling back deployment
* 13:47 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:46 atsuko@deploy1003: atsuko: Backport for [[gerrit:1302850{{!}}Revert "translate: remove CirrusSearch endpoints"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:44 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1302850{{!}}Revert "translate: remove CirrusSearch endpoints"]]
* 13:44 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:43 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:43 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:43 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:41 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:40 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 13:40 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 13:39 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:39 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302197{{!}}translate: remove CirrusSearch endpoints (T425377)]] (duration: 11m 16s)
* 13:37 atsuko@deploy1003: atsuko: Rolling back deployment
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1080.eqiad.wmnet with OS trixie
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:36 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:34 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 13:32 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1079.eqiad.wmnet with OS trixie
* 13:32 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:30 atsuko@deploy1003: atsuko: Backport for [[gerrit:1302197{{!}}translate: remove CirrusSearch endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1302197{{!}}translate: remove CirrusSearch endpoints (T425377)]]
* 13:25 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299626{{!}}Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206)]] (duration: 08m 50s)
* 13:25 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:21 dcausse@deploy1003: dcausse, neriah: Continuing with deployment
* 13:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:20 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1080.eqiad.wmnet with reason: host reimage
* 13:18 dcausse@deploy1003: dcausse, neriah: Backport for [[gerrit:1299626{{!}}Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:16 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1299626{{!}}Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206)]]
* 13:15 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:12 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1080.eqiad.wmnet with reason: host reimage
* 13:12 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298875{{!}}Remove custom streams (T423148)]] (duration: 08m 35s)
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1079.eqiad.wmnet with reason: host reimage
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1008.eqiad.wmnet with OS trixie
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:07 jmm@dns1004: END - running authdns-update
* 13:06 mfossati@deploy1003: ksarabia, mfossati: Continuing with deployment
* 13:05 mfossati@deploy1003: ksarabia, mfossati: Backport for [[gerrit:1298875{{!}}Remove custom streams (T423148)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 jmm@dns1004: START - running authdns-update
* 13:03 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1298875{{!}}Remove custom streams (T423148)]]
* 13:02 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1079.eqiad.wmnet with reason: host reimage
* 13:02 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:02 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1080.eqiad.wmnet with OS trixie
* 12:57 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:52 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1079.eqiad.wmnet with OS trixie
* 12:52 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 12:51 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1007.eqiad.wmnet with OS trixie
* 12:50 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 12:50 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver2002.codfw.wmnet
* 12:48 cmooney@cumin1003: START - Cookbook sre.mysql.pool pool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2255.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2255.codfw.wmnet
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2254.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2254.codfw.wmnet
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2243.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2243.codfw.wmnet
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2242.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2242.codfw.wmnet
* 12:47 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2092.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2092.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2091.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2091.codfw.wmnet
* 12:46 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 29 hosts
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2078.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2078.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2077.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2077.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2076.codfw.wmnet
* 12:46 cmooney@cumin1003: START - Cookbook sre.hosts.remove-downtime for 29 hosts
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2076.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2075.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2075.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2074.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2074.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2051.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2051.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2044.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2044.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2041.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2041.codfw.wmnet
* 12:46 cmooney@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2001.codfw.wmnet
* 12:46 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:45 cmooney@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2001.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2018.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2018.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2017.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2017.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2014.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2014.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2013.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2013.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2012.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2012.codfw.wmnet
* 12:44 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1008.eqiad.wmnet with reason: host reimage
* 12:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver2002.codfw.wmnet
* 12:40 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1008.eqiad.wmnet with reason: host reimage
* 12:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1006.eqiad.wmnet with reason: host reimage
* 12:28 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:24 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1008.eqiad.wmnet with OS trixie
* 12:24 topranks: reboot lsw1-a5-codfw to complete JunOS upgrade [[phab:T428020|T428020]]
* 12:23 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
* 12:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1006.eqiad.wmnet with reason: host reimage
* 12:19 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2255.codfw.wmnet
* 12:19 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2255.codfw.wmnet
* 12:19 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2254.codfw.wmnet
* 12:18 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2254.codfw.wmnet
* 12:17 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2243.codfw.wmnet
* 12:17 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2243.codfw.wmnet
* 12:17 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2242.codfw.wmnet
* 12:16 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2242.codfw.wmnet
* 12:16 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2092.codfw.wmnet
* 12:16 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2092.codfw.wmnet
* 12:16 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2091.codfw.wmnet
* 12:15 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2091.codfw.wmnet
* 12:15 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2078.codfw.wmnet
* 12:14 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2078.codfw.wmnet
* 12:14 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2077.codfw.wmnet
* 12:14 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2077.codfw.wmnet
* 12:14 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2076.codfw.wmnet
* 12:13 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2076.codfw.wmnet
* 12:13 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2075.codfw.wmnet
* 12:12 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2075.codfw.wmnet
* 12:12 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2074.codfw.wmnet
* 12:12 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2074.codfw.wmnet
* 12:12 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2051.codfw.wmnet
* 12:10 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 29 hosts with reason: lsw1-a5-codfw JunOS upgrade
* 12:07 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2051.codfw.wmnet
* 12:06 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lsw1-a5-codfw,lsw1-a5-codfw IPv6,lsw1-a5-codfw.mgmt,ssw1-a[1,8]-codfw.mgmt with reason: switch upgrade
* 12:06 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2044.codfw.wmnet
* 12:06 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2044.codfw.wmnet
* 12:06 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2041.codfw.wmnet
* 12:05 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2041.codfw.wmnet
* 12:05 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2018.codfw.wmnet
* 12:05 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2018.codfw.wmnet
* 12:04 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2017.codfw.wmnet
* 12:04 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2017.codfw.wmnet
* 12:04 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2014.codfw.wmnet
* 12:03 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2014.codfw.wmnet
* 12:03 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2013.codfw.wmnet
* 12:03 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2013.codfw.wmnet
* 12:02 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2012.codfw.wmnet
* 12:02 cmooney@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2001.codfw.wmnet
* 12:01 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2012.codfw.wmnet
* 12:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 11:57 cmooney@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2001.codfw.wmnet
* 11:51 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302794{{!}}Revert "hCaptcha: Enable for UploadWizard on all wikis with it"]] (duration: 08m 45s)
* 11:49 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:49 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:49 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 11:48 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:47 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:47 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 11:46 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:46 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:46 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 11:46 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:45 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302794{{!}}Revert "hCaptcha: Enable for UploadWizard on all wikis with it"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:43 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 11:43 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302794{{!}}Revert "hCaptcha: Enable for UploadWizard on all wikis with it"]]
* 11:42 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 11:41 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2035: Migration of es2035.codfw.wmnet completed
* 11:06 moritzm: installing Bird security updates on routed Ganeti nodes
* 10:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es1037 [[phab:T429118|T429118]]', diff saved to https://phabricator.wikimedia.org/P94172 and previous config saved to /var/cache/conftool/dbconfig/20260616-104931-marostegui.json
* 10:25 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 10:24 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 10:24 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2035: Migration of es2035.codfw.wmnet completed
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for an-redacteddb1001.eqiad.wmnet
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for an-redacteddb1001.eqiad.wmnet
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 11 hosts
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for 11 hosts
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1155.eqiad.wmnet
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1155.eqiad.wmnet
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1154.eqiad.wmnet
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1154.eqiad.wmnet
* 10:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Migration of es1036.eqiad.wmnet completed
* 10:22 jmm@dns1004: END - running authdns-update
* 10:22 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:20 jmm@dns1004: START - running authdns-update
* 10:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:19 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 10:18 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 10:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:18 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 10:18 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 10:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2035.codfw.wmnet with OS trixie
* 09:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2035.codfw.wmnet with reason: host reimage
* 09:52 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2035.codfw.wmnet with reason: host reimage
* 09:49 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 09:48 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 09:47 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302762{{!}}hCaptcha: Enable for UploadWizard on all wikis with it (T426126)]] (duration: 09m 38s)
* 09:43 marostegui: Drop wrongly created tables on testwikidatawiki s3 master [[phab:T429304|T429304]]
* 09:42 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 09:39 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302762{{!}}hCaptcha: Enable for UploadWizard on all wikis with it (T426126)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:38 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --wiki=wikidatawiki --registeredWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # [[phab:T418115|T418115]]
* 09:37 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302762{{!}}hCaptcha: Enable for UploadWizard on all wikis with it (T426126)]]
* 09:37 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --wiki=wikidatawiki --registeredWithin=1year --editedWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # [[phab:T418115|T418115]]
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Migration of es1036.eqiad.wmnet completed
* 09:37 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --registeredWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # [[phab:T418115|T418115]]
* 09:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2035.codfw.wmnet with OS trixie
* 09:34 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2035: Upgrading es2035.codfw.wmnet
* 09:34 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2035: Upgrading es2035.codfw.wmnet
* 09:34 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:32 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2035 [[phab:T429303|T429303]]', diff saved to https://phabricator.wikimedia.org/P94164 and previous config saved to /var/cache/conftool/dbconfig/20260616-093247-marostegui.json
* 09:31 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2037 to es6 primary [[phab:T429303|T429303]]', diff saved to https://phabricator.wikimedia.org/P94163 and previous config saved to /var/cache/conftool/dbconfig/20260616-093149-marostegui.json
* 09:31 jayme: imported istioctl 1.29.4-1 to bookworm-/trixie-wikimedia - [[phab:T427401|T427401]]
* 09:30 marostegui: Starting es6 codfw failover from es2035 to es2037 - [[phab:T429303|T429303]]
* 09:30 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 09:30 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 09:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es2037 with weight 0 [[phab:T429303|T429303]]', diff saved to https://phabricator.wikimedia.org/P94162 and previous config saved to /var/cache/conftool/dbconfig/20260616-092937-marostegui.json
* 09:29 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 8 hosts with reason: Primary switchover es6 [[phab:T429303|T429303]]
* 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1036.eqiad.wmnet with OS trixie
* 09:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:19 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297161{{!}}[Growth] wikidatawiki: Enable Growth features (T418115)]] (duration: 16m 29s)
* 09:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:14 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 09:13 urbanecm: php multiversion/MWScript.php WikimediaMaintenance:createExtensionTables.php --wiki=<nowiki>{</nowiki>testwikidatawiki,wikidatawiki<nowiki>}</nowiki> growthexperiments # [[phab:T418115|T418115]], within mw-debug
* 09:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage
* 09:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 09:07 tappof@cumin1003: END (PASS) - Cookbook sre.metamonitoring.downtime (exit_code=0) Downtime for 0:05:00 of prometheus/deadmanswitchnotified, prometheus/deadmanswitchonamdb, prometheus/extmon on 2 host(s) with reason: cookbook test
* 09:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 09:07 tappof@cumin1003: START - Cookbook sre.metamonitoring.downtime Downtime for 0:05:00 of prometheus/deadmanswitchnotified, prometheus/deadmanswitchonamdb, prometheus/extmon on 2 host(s) with reason: cookbook test
* 09:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 09:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 09:04 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1297161{{!}}[Growth] wikidatawiki: Enable Growth features (T418115)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage
* 09:02 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1297161{{!}}[Growth] wikidatawiki: Enable Growth features (T418115)]]
* 09:01 moritzm: uploaded bird 2.18.2-1~wmf13u1 to trixie-wikimedia [[phab:T429285|T429285]]
* 09:00 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist wikidata WikimediaMaintenance:createExtensionTables.php GrowthExperiments # [[phab:T418115|T418115]]
* 08:56 moritzm: uploaded bird 2.18.2-1~wmf12u1 to bookworm-wikimedia [[phab:T429285|T429285]]
* 08:48 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1036.eqiad.wmnet with OS trixie
* 08:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1036: Upgrading es1036.eqiad.wmnet
* 08:46 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302735{{!}}hCaptcha: Enable for MobileFrontend in all wikis (T425940)]] (duration: 19m 23s)
* 08:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1036: Upgrading es1036.eqiad.wmnet
* 08:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1047: repool after upgrade
* 08:42 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:32 moritzm: installing nginx security updates
* 08:29 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302735{{!}}hCaptcha: Enable for MobileFrontend in all wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:27 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302735{{!}}hCaptcha: Enable for MobileFrontend in all wikis (T425940)]]
* 08:23 mszwarc@deploy1003: Synchronized private/PrivateSettings.php: Private code deployment for Suggested Investigations (duration: 02m 23s)
* 08:19 mszwarc@deploy1003: Synchronized private/SuggestedInvestigationsSignals: Private code deployment for Suggested Investigations (duration: 06m 03s)
* 08:17 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94157)
* 08:05 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302629{{!}}Improve click intent event logging and exposure tracking]] (duration: 11m 31s)
* 08:00 moritzm: update bird on ganeti7001 to 2.18.2-1~wmf12u1
* 07:58 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
* 07:58 wmde-fisch@deploy1003: wmde-fisch: Backport for [[gerrit:1302629{{!}}Improve click intent event logging and exposure tracking]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1047: repool after upgrade
* 07:54 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1302629{{!}}Improve click intent event logging and exposure tracking]]
* 07:50 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302170{{!}}Update VE core submodule to master (3e79e9934) (T397319 T428764)]] (duration: 36m 13s)
* 07:36 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
* 07:33 wmde-fisch@deploy1003: wmde-fisch: Backport for [[gerrit:1302170{{!}}Update VE core submodule to master (3e79e9934) (T397319 T428764)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:14 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1302170{{!}}Update VE core submodule to master (3e79e9934) (T397319 T428764)]]
* 07:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1047.eqiad.wmnet with OS trixie
* 06:50 hashar@deploy1003: Finished deploy [integration/docroot@2165507]: build: Updating js-yaml to 4.2.0 (duration: 00m 16s)
* 06:50 hashar@deploy1003: Started deploy [integration/docroot@2165507]: build: Updating js-yaml to 4.2.0
* 06:44 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1047.eqiad.wmnet with reason: host reimage
* 06:40 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1047.eqiad.wmnet with reason: host reimage
* 06:25 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1047.eqiad.wmnet with OS trixie
* 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:24 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1047: Upgrading es1047.eqiad.wmnet
* 05:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1047: Upgrading es1047.eqiad.wmnet
* 05:58 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 04:55 ryankemper: [[phab:T427951|T427951]] Deleted 4 leftover mirrored dev/test topics from kafka-test: `eqiad.mediawiki.<nowiki>{</nowiki>page_html_content_change.dev<nowiki>{</nowiki>1,4<nowiki>}</nowiki>,page_edit_type_simple.dev0<nowiki>}</nowiki>`, `eqiad.mw_page_edit_type_enrich.error`
* 04:05 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.4 (duration: 05m 29s)
== 2026-06-15 ==
* 22:35 sbassett: Deployed private config for [[phab:T429244|T429244]]
* 22:05 sbassett: Deployed updated security fix for [[phab:T427611|T427611]]
* 22:04 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 22:04 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 22:04 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 22:03 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 21:54 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302277{{!}}beta: Point remaining db11 references at deployment-db15 (T428930)]] (duration: 12m 27s)
* 21:53 dancy@deploy1003: dancy: Continuing with deployment
* 21:49 dancy@deploy1003: dancy: Backport for [[gerrit:1302277{{!}}beta: Point remaining db11 references at deployment-db15 (T428930)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:48 sbassett: Deployed security fix for [[phab:T428809|T428809]]
* 21:48 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1302277{{!}}beta: Point remaining db11 references at deployment-db15 (T428930)]]
* 21:40 sbassett: Deployed security fix for [[phab:T428820|T428820]]
* 21:22 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302267{{!}}ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks]] (duration: 08m 11s)
* 21:17 sbassett@deploy1003: sbassett: Continuing with deployment
* 21:15 sbassett@deploy1003: sbassett: Backport for [[gerrit:1302267{{!}}ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:13 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1302267{{!}}ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks]]
* 21:06 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5028.*
* 21:06 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 21:05 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 20:52 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5028.eqsin.wmnet with OS trixie
* 20:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 20:21 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300245{{!}}REST: set new RestModuleOverrides variable (T422756)]], [[gerrit:1302232{{!}}Enable "exit the editor" survey on 11 wikis for phase 2 (T426132)]] (duration: 10m 54s)
* 20:17 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 20:16 dancy@deploy1003: caro, dancy, bpirkle: Continuing with deployment
* 20:14 dancy@deploy1003: caro, dancy, bpirkle: Backport for [[gerrit:1300245{{!}}REST: set new RestModuleOverrides variable (T422756)]], [[gerrit:1302232{{!}}Enable "exit the editor" survey on 11 wikis for phase 2 (T426132)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1300245{{!}}REST: set new RestModuleOverrides variable (T422756)]], [[gerrit:1302232{{!}}Enable "exit the editor" survey on 11 wikis for phase 2 (T426132)]]
* 20:02 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS trixie
* 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS trixie
* 19:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5028
* 19:44 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5028
* 19:43 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5028
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5028.eqsin.wmnet 25.0.132.10.in-addr.arpa 5.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:43 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5028.eqsin.wmnet 25.0.132.10.in-addr.arpa 5.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5028 - brett@cumin2002"
* 19:42 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5028 - brett@cumin2002"
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:36 brett@cumin2002: START - Cookbook sre.dns.netbox
* 19:35 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3067.esams.wmnet
* 19:34 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3067.esams.wmnet
* 19:33 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 19:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3066.esams.wmnet
* 19:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:33 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3066.esams.wmnet
* 19:26 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5028
* 19:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 19:23 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart A:liberica-eqsin
* 19:21 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart A:liberica-eqsin
* 19:18 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 19:17 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:16 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:15 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:14 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:06 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
* 19:05 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 19:05 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:04 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5026.eqsin.wmnet with OS trixie
* 18:44 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-purged (exit_code=0) rolling restart_daemons on P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:42 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-purged rolling restart_daemons on P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 18:27 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:27 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:27 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 18:18 mutante: releases2003 - systemctl stop tmp.mount
* 17:53 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5026
* 17:53 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5026
* 17:52 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5026
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5026.eqsin.wmnet 37.0.132.10.in-addr.arpa 7.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 17:52 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5026.eqsin.wmnet 37.0.132.10.in-addr.arpa 7.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5026 - brett@cumin2002"
* 17:52 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5026 - brett@cumin2002"
* 17:46 brett@cumin2002: START - Cookbook sre.dns.netbox
* 17:40 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-d8-eqiad
* 17:40 cmooney@cumin1003: START - Cookbook sre.network.tls for network device ssw1-d8-eqiad
* 17:36 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-c4-eqiad
* 17:35 cmooney@cumin1003: START - Cookbook sre.network.tls for network device lsw1-c4-eqiad
* 17:34 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-c4-eqiad
* 17:34 cmooney@cumin1003: START - Cookbook sre.network.tls for network device lsw1-c4-eqiad
* 17:09 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5026
* 17:07 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5026.eqsin.wmnet with OS trixie
* 17:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:36 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 16:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 16:16 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
* 16:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:16 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/toolhub: apply
* {{safesubst:SAL entry|1=16:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302192{{!}}SourceEditorOverlayHookPayload: Allow aborting of the save (T428287)]], [[gerrit:1302194{{!}}hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287)]], [[gerrit:1302195{{!}}OATHUserRepository: Specify caller in query]], [[gerrit:1302186{{!}}Bump guzzlehttp/psr to version 2.11.0 (T429208)]], [[gerrit:1302169{{!}}NoReferrerLinks: Add re}}
* 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:08 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
* 16:08 dreamyjazz@deploy1003: reedy, dreamyjazz, kharlan: Continuing with deployment
* 16:08 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/toolhub: apply
* {{safesubst:SAL entry|1=16:07 dreamyjazz@deploy1003: reedy, dreamyjazz, kharlan: Backport for [[gerrit:1302192{{!}}SourceEditorOverlayHookPayload: Allow aborting of the save (T428287)]], [[gerrit:1302194{{!}}hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287)]], [[gerrit:1302195{{!}}OATHUserRepository: Specify caller in query]], [[gerrit:1302186{{!}}Bump guzzlehttp/psr to version 2.11.0 (T429208)]], [[gerrit:1302169{{!}}NoReferrerLinks: Add}}
* {{safesubst:SAL entry|1=16:05 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302192{{!}}SourceEditorOverlayHookPayload: Allow aborting of the save (T428287)]], [[gerrit:1302194{{!}}hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287)]], [[gerrit:1302195{{!}}OATHUserRepository: Specify caller in query]], [[gerrit:1302186{{!}}Bump guzzlehttp/psr to version 2.11.0 (T429208)]], [[gerrit:1302169{{!}}NoReferrerLinks: Add rel}}
* 16:04 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:04 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:51 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:51 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: puppet debugging
* 15:50 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: puppet debugging
* 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1196: Migration of db1196.eqiad.wmnet completed
* 15:41 mutante: added new project language 'nyn' - Bantu language spoken by the Nkore and Hema peoples of Southwestern Uganda
* 15:40 dzahn@dns1006: END - running authdns-update
* 15:36 dzahn@dns1006: START - running authdns-update
* 15:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1155.eqiad.wmnet
* 15:19 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1155.eqiad.wmnet
* 15:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1154.eqiad.wmnet
* 15:18 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1154.eqiad.wmnet
* 15:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 11 hosts
* 15:18 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for 11 hosts
* 15:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for an-redacteddb1001.eqiad.wmnet
* 15:17 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for an-redacteddb1001.eqiad.wmnet
* 15:16 topranks: repool esams following cr2-esams rpd crash
* 15:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool esams [reason: no reason specified, no task ID specified]
* 15:13 cmooney@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool esams [reason: no reason specified, no task ID specified]
* 15:02 topranks: depool esams due to cr2-esams rpd crash
* 15:02 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool esams [reason: no reason specified, no task ID specified]
* 15:01 cmooney@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool esams [reason: no reason specified, no task ID specified]
* 15:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 14:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 14:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1196: Migration of db1196.eqiad.wmnet completed
* 14:54 topranks: enable BGP graceful-shutdown sender on cr2-esams to drain traffic [[phab:T427056|T427056]]
* 14:52 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr2-esams,cr2-esams IPv6 with reason: bouncing pic0 to reconfigure port speeds
* 14:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1196.eqiad.wmnet with OS trixie
* 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1077.eqiad.wmnet with OS trixie
* 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 14:24 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2001.codfw.wmnet with reason: tesT
* 14:24 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 14:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
* 14:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
* 14:08 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 14:07 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 14:07 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudvirt1077.eqiad.wmnet with reason: host reimage
* 14:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1077.eqiad.wmnet with reason: host reimage
* 14:06 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 14:05 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 14:05 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 14:04 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 14:03 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1196.eqiad.wmnet with OS trixie
* 14:02 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 14:02 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 14:01 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 14:01 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 14:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1196: Upgrading db1196.eqiad.wmnet
* 14:00 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1196: Upgrading db1196.eqiad.wmnet
* 14:00 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:56 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1077.eqiad.wmnet with OS trixie
* 13:56 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 13:54 federico3: doing a quick restart of sanitarium hosts db1155 and db1154
* 13:53 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94145)
* 13:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1154.eqiad.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1155.eqiad.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 11 hosts with reason: Reboots [[phab:T426633|T426633]]
* 13:49 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet with reason: Reboots [[phab:T426633|T426633]]
* {{safesubst:SAL entry|1=13:43 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300835{{!}}Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742)]], [[gerrit:1302153{{!}}TaskSuggester: avoid nullable logger in setLogger call]], [[gerrit:1302100{{!}}migrateMentorStatusAway: ensure validateStrictly receives objects (T409170)]], [[gerrit:1301451{{!}}Store nowiki source in StripState::extra to support subst-nowiki (T}}
* 13:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 13:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 13:39 jforrester@deploy1003: arlolra, sgimeno, jforrester: Continuing with deployment
* {{safesubst:SAL entry|1=13:37 jforrester@deploy1003: arlolra, sgimeno, jforrester: Backport for [[gerrit:1300835{{!}}Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742)]], [[gerrit:1302153{{!}}TaskSuggester: avoid nullable logger in setLogger call]], [[gerrit:1302100{{!}}migrateMentorStatusAway: ensure validateStrictly receives objects (T409170)]], [[gerrit:1301451{{!}}Store nowiki source in StripState::extra to support subst-nowik}}
* {{safesubst:SAL entry|1=13:35 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1300835{{!}}Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742)]], [[gerrit:1302153{{!}}TaskSuggester: avoid nullable logger in setLogger call]], [[gerrit:1302100{{!}}migrateMentorStatusAway: ensure validateStrictly receives objects (T409170)]], [[gerrit:1301451{{!}}Store nowiki source in StripState::extra to support subst-nowiki (T3}}
* 13:34 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2216: Migration of db2216.codfw.wmnet completed
* 13:29 topranks: enable BGP graceful-shutdown sender on cr2-esams to drain traffic [[phab:T427056|T427056]]
* 13:28 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr2-esams,cr2-esams IPv6 with reason: bouncing pic0 to reconfigure port speeds
* 13:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:26 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 13:25 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 13:25 topranks: cr2-esams, reconfigure chassis fpc to set port 0 to 100G [[phab:T427056|T427056]]
* 13:25 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 13:24 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1251: Migration of db1251.eqiad.wmnet completed
* {{safesubst:SAL entry|1=13:22 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293173{{!}}Configure wgOAuthAutoApprove['protocols'] (T412542 T426614)]], [[gerrit:1300873{{!}}jawiki: remove four rights from the eliminator group (T428942)]], [[gerrit:1301401{{!}}Deploy PRV to 6 wikis (T429038)]], [[gerrit:1300858{{!}}[abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730)]], [[gerrit:1300872{{!}}abstractwiki: Temporary config f}}
* 13:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:18 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:17 jforrester@deploy1003: arlolra, matmarex, jforrester, dragoniez: Continuing with deployment
* {{safesubst:SAL entry|1=13:13 jforrester@deploy1003: arlolra, matmarex, jforrester, dragoniez: Backport for [[gerrit:1293173{{!}}Configure wgOAuthAutoApprove['protocols'] (T412542 T426614)]], [[gerrit:1300873{{!}}jawiki: remove four rights from the eliminator group (T428942)]], [[gerrit:1301401{{!}}Deploy PRV to 6 wikis (T429038)]], [[gerrit:1300858{{!}}[abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730)]], [[gerrit:1300872{{!}}abstractwiki: Te}}
* 13:13 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:12 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* {{safesubst:SAL entry|1=13:12 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1293173{{!}}Configure wgOAuthAutoApprove['protocols'] (T412542 T426614)]], [[gerrit:1300873{{!}}jawiki: remove four rights from the eliminator group (T428942)]], [[gerrit:1301401{{!}}Deploy PRV to 6 wikis (T429038)]], [[gerrit:1300858{{!}}[abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730)]], [[gerrit:1300872{{!}}abstractwiki: Temporary config fo}}
* 13:10 moritzm: installing Linux 6.1.174 on Bookworm hosts
* 13:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:05 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:48 moritzm: installing augeas security updates
* 12:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2216: Migration of db2216.codfw.wmnet completed
* 12:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:43 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2036: Migration of es2036.codfw.wmnet completed
* 12:38 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302124{{!}}Extract a service that initiates SI signal matching (T428557)]], [[gerrit:1302125{{!}}Trigger Suggested Investigations when client hints are saved (T428557)]] (duration: 07m 42s)
* 12:37 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1251: Migration of db1251.eqiad.wmnet completed
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2216.codfw.wmnet with OS trixie
* 12:34 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:34 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 12:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:32 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1302124{{!}}Extract a service that initiates SI signal matching (T428557)]], [[gerrit:1302125{{!}}Trigger Suggested Investigations when client hints are saved (T428557)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:31 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1302124{{!}}Extract a service that initiates SI signal matching (T428557)]], [[gerrit:1302125{{!}}Trigger Suggested Investigations when client hints are saved (T428557)]]
* 12:27 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1251.eqiad.wmnet with OS trixie
* 12:23 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 12:21 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 12:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2216.codfw.wmnet with reason: host reimage
* 12:15 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2216.codfw.wmnet with reason: host reimage
* 12:10 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1251.eqiad.wmnet with reason: host reimage
* 12:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 12:06 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:05 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1251.eqiad.wmnet with reason: host reimage
* 11:56 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 11:55 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 11:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2036: Migration of es2036.codfw.wmnet completed
* 11:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:53 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2216.codfw.wmnet with OS trixie
* 11:50 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2216: Upgrading db2216.codfw.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2216: Upgrading db2216.codfw.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:48 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1251.eqiad.wmnet with OS trixie
* 11:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1251: Upgrading db1251.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1251: Upgrading db1251.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:44 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94128)
* 11:43 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:43 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on eqiad-k8s (dblist: https://phabricator.wikimedia.org/P94127)
* 11:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2036.codfw.wmnet with OS trixie
* 11:37 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2036.codfw.wmnet with reason: host reimage
* 11:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2036.codfw.wmnet with reason: host reimage
* 11:09 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-eqiad
* 11:08 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-eqiad
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2036.codfw.wmnet with OS trixie
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2036: Upgrading es2036.codfw.wmnet
* 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2036: Upgrading es2036.codfw.wmnet
* 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:55 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-codfw
* 10:54 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-codfw
* 10:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2037: repool after upgrade
* 10:52 moritzm: installing openssl security updates on bookworm
* 10:30 cgoubert@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301341{{!}}Close API Portal wiki (T427537)]] (duration: 07m 16s)
* 10:26 cgoubert@deploy1003: cgoubert: Continuing with deployment
* 10:25 cgoubert@deploy1003: cgoubert: Backport for [[gerrit:1301341{{!}}Close API Portal wiki (T427537)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:23 cgoubert@deploy1003: Started scap sync-world: Backport for [[gerrit:1301341{{!}}Close API Portal wiki (T427537)]]
* 10:16 blake@deploy1003: Finished scap sync-world: apache config change ([[phab:T428772|T428772]]) (duration: 06m 41s)
* 10:12 blake@deploy1003: blake: Continuing with deployment
* 10:11 blake@deploy1003: blake: apache config change ([[phab:T428772|T428772]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:10 blake@deploy1003: Started scap sync-world: apache config change ([[phab:T428772|T428772]])
* 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2037: repool after upgrade
* 10:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2037.codfw.wmnet with OS trixie
* 09:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 09:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 09:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 09:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:43 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 09:42 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 09:40 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on eqiad-k8s (dblist: https://phabricator.wikimedia.org/P94120)
* 09:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2037.codfw.wmnet with reason: host reimage
* 09:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2037.codfw.wmnet with reason: host reimage
* 09:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:15 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:14 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2037.codfw.wmnet with OS trixie
* 09:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:12 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:59 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:56 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2037.codfw.wmnet with OS trixie
* 08:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2037.codfw.wmnet with OS trixie
* 08:53 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2037: Upgrading es2037.codfw.wmnet
* 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2037: Upgrading es2037.codfw.wmnet
* 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:45 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:45 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:44 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:43 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:41 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:40 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:35 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P94117 and previous config saved to /var/cache/conftool/dbconfig/20260615-081440-fceratto.json
* 08:10 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301373{{!}}translate: production opensearch on k8s endpoints (T425377)]] (duration: 20m 54s)
* 08:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2047: Migration of es2047.codfw.wmnet completed
* 08:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P94115 and previous config saved to /var/cache/conftool/dbconfig/20260615-080432-fceratto.json
* 08:03 atsuko@deploy1003: atsuko: Continuing with deployment
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P94114 and previous config saved to /var/cache/conftool/dbconfig/20260615-075425-fceratto.json
* 07:53 atsuko@deploy1003: atsuko: Backport for [[gerrit:1301373{{!}}translate: production opensearch on k8s endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:49 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1301373{{!}}translate: production opensearch on k8s endpoints (T425377)]]
* 07:47 dcausse@deploy1003: mwscript-k8s job started: namespaceDupes cswiki --fix # [[phab:T428619|T428619]]
* 07:46 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301675{{!}}Switch wmgUseCalendar to false for dewikivoyage (T429095)]], [[gerrit:1300301{{!}}Add alias namespace for cswiki (T428619)]] (duration: 34m 37s)
* 07:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P94112 and previous config saved to /var/cache/conftool/dbconfig/20260615-074417-fceratto.json
* 07:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:33 dcausse@deploy1003: vadymts1, dcausse: Continuing with deployment
* 07:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:31 cwilliams@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on db-test2001.codfw.wmnet with reason: Testing
* 07:28 dcausse@deploy1003: vadymts1, dcausse: Backport for [[gerrit:1301675{{!}}Switch wmgUseCalendar to false for dewikivoyage (T429095)]], [[gerrit:1300301{{!}}Add alias namespace for cswiki (T428619)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:26 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:26 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:25 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:24 arnaudb@dns1005: END - running authdns-update
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1163 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P94110 and previous config saved to /var/cache/conftool/dbconfig/20260615-072446-fceratto.json
* 07:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 07:24 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:23 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:23 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2047: Migration of es2047.codfw.wmnet completed
* 07:23 arnaudb@dns1005: START - running authdns-update
* 07:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:21 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:11 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2047.codfw.wmnet with OS trixie
* 07:11 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1301675{{!}}Switch wmgUseCalendar to false for dewikivoyage (T429095)]], [[gerrit:1300301{{!}}Add alias namespace for cswiki (T428619)]]
* 07:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 06:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2047.codfw.wmnet with reason: host reimage
* 06:53 moritzm: imported zookeeper 3.4.13-6+wmf12u1 to component/zookeeper34 for bookworm-wikimedia [[phab:T428495|T428495]]
* 06:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2047.codfw.wmnet with reason: host reimage
* 06:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2047.codfw.wmnet with OS trixie
* 06:28 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2047: Upgrading es2047.codfw.wmnet
* 06:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2047: Upgrading es2047.codfw.wmnet
* 06:27 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 06:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:59 marostegui: install mariadb 10.11.18 on pc1 [[phab:T428861|T428861]]
* 05:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2021.codfw.wmnet,pc1021.eqiad.wmnet with reason: upgrading
* 05:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:56 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:56 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:49 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:49 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2046', diff saved to https://phabricator.wikimedia.org/P94105 and previous config saved to /var/cache/conftool/dbconfig/20260615-053403-marostegui.json
* 05:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on es2046.codfw.wmnet with reason: cloning
* 05:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on es2045.codfw.wmnet with reason: crash
* 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2046', diff saved to https://phabricator.wikimedia.org/P94104 and previous config saved to /var/cache/conftool/dbconfig/20260615-053041-marostegui.json
* 02:18 Amir1: making Dexbot a bot in cywiki ([[phab:T428927|T428927]])
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 58s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-14 ==
* 11:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 11:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 11:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 11:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 34s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-13 ==
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-12 ==
* 19:54 dwisehaupt@dns1004: END - running authdns-update
* 19:52 dwisehaupt@dns1004: START - running authdns-update
* 18:33 dwisehaupt@dns1006: END - running authdns-update
* 18:32 dwisehaupt@dns1006: START - running authdns-update
* 16:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 15:59 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 15:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 15:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:43 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301371{{!}}Hotfix for T428620 (T428620)]] (duration: 11m 17s)
* 14:36 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 14:35 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1301371{{!}}Hotfix for T428620 (T428620)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:31 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1301371{{!}}Hotfix for T428620 (T428620)]]
* 14:29 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:28 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:22 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:22 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:22 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:22 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 12:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 12:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 12:04 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:04 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:04 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 12:02 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 12:01 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 11:40 moritzm: installing Linux 5.10.257 on Bullseye hosts
* 11:36 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:35 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:35 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:24 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 11:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:56 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:56 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:49 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:49 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:40 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:37 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:36 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:35 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:12 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:12 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:08 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 09:59 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 09:58 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 09:57 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.disable-merges (exit_code=0)
* 06:11 jmm@cumin2002: START - Cookbook sre.puppet.disable-merges
* 03:07 ryankemper: [[phab:T427951|T427951]] sorry, `[eqiad,codfw].mediawiki.page_html_content_change.rc0` (accidentally a word)
* 03:06 ryankemper: [[phab:T427951|T427951]] Deleted all 20 unused dev/test topics on kafka-jumbo (verified empty first); 2 (`[eqiad,codfw]page_html_content_change.rc0`) were immediately auto-recreated empty by a still-running `dse-k8s` enrichment consumer; awaiting owner confirmation before final re-delete
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 01m 13s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:00 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
== 2026-06-11 ==
* 22:27 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 22:26 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 22:14 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 22:13 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 22:05 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300906{{!}}Restore MediaViewer toggle in Special:Preferences (T428742)]] (duration: 30m 51s)
* 21:58 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host releases2003.codfw.wmnet with OS trixie
* 21:52 egardner@deploy1003: egardner: Continuing with deployment
* 21:51 egardner@deploy1003: egardner: Backport for [[gerrit:1300906{{!}}Restore MediaViewer toggle in Special:Preferences (T428742)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:34 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1300906{{!}}Restore MediaViewer toggle in Special:Preferences (T428742)]]
* 21:34 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: host reimage
* 21:29 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300913{{!}}Avoid the escaping from nowiki processing (T398967)]] (duration: 09m 09s)
* 21:28 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on releases2003.codfw.wmnet with reason: host reimage
* 21:25 arlolra@deploy1003: arlolra: Continuing with deployment
* 21:22 arlolra@deploy1003: arlolra: Backport for [[gerrit:1300913{{!}}Avoid the escaping from nowiki processing (T398967)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:20 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1300913{{!}}Avoid the escaping from nowiki processing (T398967)]]
* 21:07 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300911{{!}}hCaptcha: Enable for badlogin for all small wikis (T426875)]], [[gerrit:1300905{{!}}RadioRangeBallot: Fix strict mode issue (T428947)]] (duration: 10m 43s)
* 21:06 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on A:cp-text and not P<nowiki>{</nowiki>cp7008*<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 21:01 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 21:00 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300911{{!}}hCaptcha: Enable for badlogin for all small wikis (T426875)]], [[gerrit:1300905{{!}}RadioRangeBallot: Fix strict mode issue (T428947)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:56 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300911{{!}}hCaptcha: Enable for badlogin for all small wikis (T426875)]], [[gerrit:1300905{{!}}RadioRangeBallot: Fix strict mode issue (T428947)]]
* 20:51 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300842{{!}}Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313)]], [[gerrit:1300843{{!}}[A11y] Donor Badge: Remove Badge button disappears too quickly (T428646)]], [[gerrit:1300896{{!}}Donor Delight Badge, styles: Amending to final design review feedback (T427313)]] (duration: 34m 10s)
* 20:39 jdrewniak@deploy1003: annet, jdrewniak: Continuing with deployment
* 20:35 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host releases2003.codfw.wmnet with OS trixie
* 20:34 jdrewniak@deploy1003: annet, jdrewniak: Backport for [[gerrit:1300842{{!}}Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313)]], [[gerrit:1300843{{!}}[A11y] Donor Badge: Remove Badge button disappears too quickly (T428646)]], [[gerrit:1300896{{!}}Donor Delight Badge, styles: Amending to final design review feedback (T427313)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1300842{{!}}Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313)]], [[gerrit:1300843{{!}}[A11y] Donor Badge: Remove Badge button disappears too quickly (T428646)]], [[gerrit:1300896{{!}}Donor Delight Badge, styles: Amending to final design review feedback (T427313)]]
* 19:12 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 18:12 ozge@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 18:12 ozge@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 17:52 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300865{{!}}UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146)]] (duration: 08m 15s)
* 17:48 reedy@deploy1003: reedy: Continuing with deployment
* 17:46 reedy@deploy1003: reedy: Backport for [[gerrit:1300865{{!}}UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:44 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1300865{{!}}UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146)]]
* 17:26 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:25 blake@deploy1003: Scap cancelled without rolling back.
* 17:25 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 17:24 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 17:24 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:24 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 17:24 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:23 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 17:23 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 17:23 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:23 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 17:23 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:23 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 17:20 blake@deploy1003: blake: apache config update ([[phab:T428772|T428772]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:20 blake@deploy1003: Started scap sync-world: apache config update ([[phab:T428772|T428772]])
* 17:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 17:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Migration of db2212.codfw.wmnet completed
* 17:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 17:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1235: Migration of db1235.eqiad.wmnet completed
* 17:08 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:45 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:43 dzahn@dns1005: END - running authdns-update
* 16:42 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:41 dzahn@dns1005: START - running authdns-update
* 16:41 mutante: releases.wikimedia.org - switching backend from codfw to eqiad - releases1003 is now the source of rsync for uploaded releases files (use releases.discovery.wmnet to not have to think about it) - [[phab:T418299|T418299]]
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts rdb2007.codfw.wmnet
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts rdb1011.eqiad.wmnet
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2009.codfw.wmnet
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:33 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Migration of db2212.codfw.wmnet completed
* 16:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1235: Migration of db1235.eqiad.wmnet completed
* 16:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2212.codfw.wmnet with OS trixie
* 16:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1235.eqiad.wmnet with OS trixie
* 16:13 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:07 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 16:05 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 16:05 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 16:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 16:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2212.codfw.wmnet with reason: host reimage
* 16:01 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
* 16:01 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:01 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
* 16:01 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 16:00 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
* 16:00 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 16:00 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
* 16:00 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2212.codfw.wmnet with reason: host reimage
* 15:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1235.eqiad.wmnet with reason: host reimage
* 15:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:58 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 15:57 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
* 15:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:57 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 15:57 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/wikifeeds: apply
* 15:56 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2009.codfw.wmnet
* 15:55 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:55 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb1011.eqiad.wmnet
* 15:55 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:55 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2007.codfw.wmnet
* 15:54 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 15:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1235.eqiad.wmnet with reason: host reimage
* 15:54 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 15:53 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 15:53 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 15:40 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 15:40 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2212.codfw.wmnet with OS trixie
* 15:39 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 15:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1235.eqiad.wmnet with OS trixie
* 15:36 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 15:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1235: Upgrading db1235.eqiad.wmnet
* 15:35 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 15:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1235: Upgrading db1235.eqiad.wmnet
* 15:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:32 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:31 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 15:30 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300822{{!}}T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530)]] (duration: 11m 29s)
* 15:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2212: Upgrading db2212.codfw.wmnet
* 15:26 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2212: Upgrading db2212.codfw.wmnet
* 15:26 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:26 cscott@deploy1003: cscott: Continuing with deployment
* 15:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1235: Upgrading db1235.eqiad.wmnet
* 15:25 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1235: Upgrading db1235.eqiad.wmnet
* 15:25 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:21 cscott@deploy1003: cscott: Backport for [[gerrit:1300822{{!}}T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:19 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1300822{{!}}T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530)]]
* 15:18 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 15:17 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 15:13 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 15:13 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 15:13 moritzm: installing libdbi-perl security updates
* 14:53 moritzm: installing Bind security updates (just client-side tools/libraries)
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry
* 14:48 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry
* 14:43 moritzm: installing Poppler security updates
* 14:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:33 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 14:32 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 14:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1234: Migration of db1234.eqiad.wmnet completed
* 14:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
* 14:24 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
* 14:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
* 14:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
* 14:00 Lucas_WMDE: UTC afternoon backport+config window done
* 13:58 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300733{{!}}stream: webrequest.page_view_stats.dev0 (T428725)]] (duration: 08m 12s)
* 13:57 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5024.*
* 13:55 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp5024.*
* 13:55 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5020.*
* 13:54 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 13:52 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1300733{{!}}stream: webrequest.page_view_stats.dev0 (T428725)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:51 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5004*<nowiki>}</nowiki> and A:liberica
* 13:50 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1300733{{!}}stream: webrequest.page_view_stats.dev0 (T428725)]]
* 13:50 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5004*<nowiki>}</nowiki> and A:liberica
* 13:50 slyngs: reloading liberica config on lvs5004
* 13:50 moritzm: installing openssl security updates
* 13:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5006.eqsin.wmnet with OS bookworm
* 13:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1234: Migration of db1234.eqiad.wmnet completed
* 13:46 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:45 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2202.codfw.wmnet with OS trixie
* 13:43 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298890{{!}}Add 2FA enforcement demotion config for phase 3 groups (T423120)]] (duration: 07m 19s)
* 13:39 alexsanford@deploy1003: alexsanford: Continuing with deployment
* 13:38 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1298890{{!}}Add 2FA enforcement demotion config for phase 3 groups (T423120)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:36 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1298890{{!}}Add 2FA enforcement demotion config for phase 3 groups (T423120)]]
* 13:36 slyngshede@dns1004: END - running authdns-update
* 13:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1234.eqiad.wmnet with OS trixie
* 13:34 moritzm: installing dovecot security updates
* 13:34 slyngshede@dns1004: START - running authdns-update
* 13:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:32 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300787{{!}}hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940)]] (duration: 06m 59s)
* 13:29 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 13:29 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 13:29 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:29 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:28 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:28 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:28 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 13:27 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300787{{!}}hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2202.codfw.wmnet with reason: host reimage
* 13:25 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300787{{!}}hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940)]]
* 13:25 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:24 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300736{{!}}fix: correct intake-url and payload type for NCS experiment events (T422295)]] (duration: 06m 51s)
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
* 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
* 13:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
* 13:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2202.codfw.wmnet with reason: host reimage
* 13:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1300736{{!}}fix: correct intake-url and payload type for NCS experiment events (T422295)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 13:17 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 13:16 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1300736{{!}}fix: correct intake-url and payload type for NCS experiment events (T422295)]]
* 13:15 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
* 13:14 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:13 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300731{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] (duration: 08m 47s)
* 13:13 andrewbogott: sudo -i reprepro --noskipold --component thirdparty/openstack-trixie-flamingo-backports update trixie-wikimedia
* 13:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
* 13:12 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 13:12 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/iOS_FAQ 'Wikimedia Apps/FAQ/iOS' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:12 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 13:12 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 13:11 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 13:11 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:11 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:11 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 13:11 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 13:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 13:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 13:09 gkyziridis@deploy1003: gkyziridis: Continuing with deployment
* 13:06 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1300731{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:06 claime: echo 'https://api.wikimedia.org/service/lw/specs/openapi.yaml' {{!}} mwscript-k8s --attach -- purgeList.php
* 13:04 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1300731{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]]
* 13:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2202.codfw.wmnet with OS trixie
* 13:00 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:57 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1234.eqiad.wmnet with OS trixie
* 12:55 moritzm: installing Exim security updates on Bullseye
* 12:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ganeti5006
* 12:47 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti5006
* 12:46 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti5006
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti5006.eqsin.wmnet 9.0.132.10.in-addr.arpa 9.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 12:46 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti5006.eqsin.wmnet 9.0.132.10.in-addr.arpa 9.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5006 - jmm@cumin2002"
* 12:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5006 - jmm@cumin2002"
* 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1234: Upgrading db1234.eqiad.wmnet
* 12:44 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1234: Upgrading db1234.eqiad.wmnet
* 12:44 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2188: Migration of db2188.codfw.wmnet completed
* 12:29 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "UX improvements - oblivian@cumin1003"
* 12:29 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: UX improvements - oblivian@cumin1003
* 12:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1232: Migration of db1232.eqiad.wmnet completed
* 12:28 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: UX improvements - oblivian@cumin1003
* 12:28 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "UX improvements - oblivian@cumin1003"
* 12:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:26 jmm@cumin2002: START - Cookbook sre.hosts.move-vlan for host ganeti5006
* 12:26 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5006.eqsin.wmnet with OS bookworm
* 12:21 moritzm: remove ganeti5006 from eqsin cluster for reimage [[phab:T428229|T428229]]
* 12:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 12:10 moritzm: installing openjdk-21 security updates on Bookworm
* 12:03 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300764{{!}}Remove GrowthExperiments extension from closed wikis (T428884)]] (duration: 06m 53s)
* 11:59 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 11:58 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1300764{{!}}Remove GrowthExperiments extension from closed wikis (T428884)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:56 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1300764{{!}}Remove GrowthExperiments extension from closed wikis (T428884)]]
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb1012.eqiad.wmnet
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2010.codfw.wmnet
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 11:46 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:46 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2008.codfw.wmnet
* 11:46 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2188: Migration of db2188.codfw.wmnet completed
* 11:44 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 11:43 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:43 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1232: Migration of db1232.eqiad.wmnet completed
* 11:38 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:37 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 11:37 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 11:36 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 11:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2188.codfw.wmnet with OS trixie
* 11:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb1012.eqiad.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2008.codfw.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2010.codfw.wmnet
* 11:33 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 11:32 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 11:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1232.eqiad.wmnet with OS trixie
* 11:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2002.codfw.wmnet
* 11:25 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300749{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300751{{!}}hCaptcha: Enable for DiscussionTools on all wikis (T426039)]] (duration: 08m 38s)
* 11:21 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 11:19 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300749{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300751{{!}}hCaptcha: Enable for DiscussionTools on all wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2188.codfw.wmnet with reason: host reimage
* 11:17 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300749{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300751{{!}}hCaptcha: Enable for DiscussionTools on all wikis (T426039)]]
* 11:15 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2188.codfw.wmnet with reason: host reimage
* 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1232.eqiad.wmnet with reason: host reimage
* 11:13 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2002.codfw.wmnet
* 11:13 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 11:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 11:11 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 11:09 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2001.codfw.wmnet
* 11:09 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1232.eqiad.wmnet with reason: host reimage
* 11:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:04 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2001.codfw.wmnet
* 11:04 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
* 11:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1262.eqiad.wmnet with reason: crash
* 11:00 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 11:00 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
* 10:59 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:59 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 10:58 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2188.codfw.wmnet with OS trixie
* 10:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2188: Upgrading db2188.codfw.wmnet
* 10:52 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2188: Upgrading db2188.codfw.wmnet
* 10:52 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1232.eqiad.wmnet with OS trixie
* 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1232: Upgrading db1232.eqiad.wmnet
* 10:48 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1232: Upgrading db1232.eqiad.wmnet
* 10:48 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:33 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:32 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:31 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300734{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300727{{!}}hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039)]] (duration: 11m 01s)
* 10:26 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 10:23 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:23 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:22 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300734{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300727{{!}}hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:20 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300734{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300727{{!}}hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039)]]
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:10 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:10 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 10:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2045.codfw.wmnet with OS trixie
* 10:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2046', diff saved to https://phabricator.wikimedia.org/P94069 and previous config saved to /var/cache/conftool/dbconfig/20260611-100221-marostegui.json
* 10:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2046', diff saved to https://phabricator.wikimedia.org/P94068 and previous config saved to /var/cache/conftool/dbconfig/20260611-100145-marostegui.json
* 10:01 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:59 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300580{{!}}ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976)]] (duration: 15m 41s)
* 09:54 jiji@deploy1003: jiji: Continuing with deployment
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2045.codfw.wmnet with reason: host reimage
* 09:45 jiji@deploy1003: jiji: Backport for [[gerrit:1300580{{!}}ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:43 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1300580{{!}}ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976)]]
* 09:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2045.codfw.wmnet with reason: host reimage
* 09:37 elukey: uploaded spicerack_12.8.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 09:26 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
* 09:26 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es2045.codfw.wmnet with OS bookworm
* 09:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2176: Migration of db2176.codfw.wmnet completed
* 09:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1219: Migration of db1219.eqiad.wmnet completed
* 09:11 claime: cumin -x 'A:swift-fe' "disable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
* 08:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS bookworm
* 08:39 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2176: Migration of db2176.codfw.wmnet completed
* 08:34 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94055)
* 08:34 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1219: Migration of db1219.eqiad.wmnet completed
* 08:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94053)
* 08:30 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T428823|T428823]] (duration: 01m 18s)
* 08:29 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T428823|T428823]]
* 08:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2176.codfw.wmnet with OS trixie
* 08:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1021: Migration to 10.11.17
* 08:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 08:25 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 08:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc1021: Migration to 10.11.17
* 08:25 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94052)
* 08:24 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): Testing upgrade for [[phab:T428823|T428823]] (duration: 01m 17s)
* 08:23 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): Testing upgrade for [[phab:T428823|T428823]]
* 08:22 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94051)
* 08:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1219.eqiad.wmnet with OS trixie
* 08:17 moritzm: installing PHP 8.2 security updates
* 08:15 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 08:14 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 08:11 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 08:11 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 08:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
* 08:08 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1013.eqiad.wmnet with OS trixie
* 08:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:06 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 08:06 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 08:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2021.codfw.wmnet,pc1021.eqiad.wmnet with reason: upgrade
* 08:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1219.eqiad.wmnet with reason: host reimage
* 08:05 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 08:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 08:04 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
* 08:04 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 08:03 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 08:03 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 07:58 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1219.eqiad.wmnet with reason: host reimage
* 07:56 marostegui: install mariadb 10.11.17 on pc1 [[phab:T427345|T427345]]
* 07:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1013.eqiad.wmnet with reason: host reimage
* 07:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1013.eqiad.wmnet with reason: host reimage
* 07:49 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 07:49 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 07:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 07:47 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 07:47 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 07:46 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2176.codfw.wmnet with OS trixie
* 07:43 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1219.eqiad.wmnet with OS trixie
* 07:43 moritzm: imported Jenkins 2.541.3 for thirdparty/ci (Bullseye) and thirdparty/jenkins (Bookworm, Trixie)
* 07:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 07:35 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1013.eqiad.wmnet with OS trixie
* 07:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2176: Upgrading db2176.codfw.wmnet
* 07:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1219: Upgrading db1219.eqiad.wmnet
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2176: Upgrading db2176.codfw.wmnet
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1219: Upgrading db1219.eqiad.wmnet
* 07:31 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:30 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 07:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1163: Repooling
* 07:19 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 06:51 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
* 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2042', diff saved to https://phabricator.wikimedia.org/P94044 and previous config saved to /var/cache/conftool/dbconfig/20260611-065049-marostegui.json
* 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2042', diff saved to https://phabricator.wikimedia.org/P94043 and previous config saved to /var/cache/conftool/dbconfig/20260611-065027-marostegui.json
* 06:44 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1163: Repooling
* 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1163 [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94041 and previous config saved to /var/cache/conftool/dbconfig/20260611-064319-fceratto.json
* 06:42 fceratto@dns1005: END - running authdns-update
* 06:40 fceratto@dns1005: START - running authdns-update
* 06:33 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:33 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-write for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:33 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1184 to s1 primary and set section read-write [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94040 and previous config saved to /var/cache/conftool/dbconfig/20260611-063323-fceratto.json
* 06:32 fceratto@cumin1003: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94039 and previous config saved to /var/cache/conftool/dbconfig/20260611-063251-fceratto.json
* 06:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:32 fceratto@cumin1003: Dbctl change: Setting sections s1 as read-write for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:32 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-write for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:31 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94037 and previous config saved to /var/cache/conftool/dbconfig/20260611-063100-fceratto.json
* 06:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:30 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-only for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:30 fceratto@cumin1003: Dbctl change: Setting sections s1 as read-only for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:29 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:29 federico3: Starting s1 eqiad failover from db1163 to db1184 - [[phab:T426083|T426083]]
* 06:22 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1184 with weight 0 [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94035 and previous config saved to /var/cache/conftool/dbconfig/20260611-062224-fceratto.json
* 06:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T426083|T426083]]
* 05:37 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 05:28 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 05:27 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 05:18 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 05:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2045: Upgrading es2045.codfw.wmnet
* 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2045: Upgrading es2045.codfw.wmnet
* 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 44s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:23 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2046.*
* 01:19 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 01:18 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 01:18 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1009.eqiad.wmnet with OS trixie
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:11 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:11 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:11 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:10 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:10 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:09 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 01:09 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 01:08 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 01:08 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 01:08 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 01:07 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:07 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 01:06 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 01:06 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 01:06 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 01:05 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 01:05 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 01:05 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 01:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
* 00:58 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
* 00:54 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main1009
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main1009
* 00:41 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main1009
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main1009.eqiad.wmnet 37.48.64.10.in-addr.arpa 7.3.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:41 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main1009.eqiad.wmnet 37.48.64.10.in-addr.arpa 7.3.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1009 - jasmine@cumin2002"
* 00:40 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1009 - jasmine@cumin2002"
* 00:39 cdanis@cumin1003: dbctl commit (dc=all): 'depool db1262', diff saved to https://phabricator.wikimedia.org/P94032 and previous config saved to /var/cache/conftool/dbconfig/20260611-003950-cdanis.json
* 00:36 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 00:34 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5020.*
* 00:30 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main1009
* 00:30 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1009.eqiad.wmnet with OS trixie
* 00:03 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5024.*
== 2026-06-10 ==
* 23:53 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5024.*
* 23:15 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300154{{!}}Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188)]] (duration: 11m 37s)
* 23:11 krinkle@deploy1003: krinkle: Continuing with deployment
* 23:06 krinkle@deploy1003: krinkle: Backport for [[gerrit:1300154{{!}}Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:04 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1300154{{!}}Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188)]]
* 22:57 ladsgroup@dns1004: END - running authdns-update
* 22:55 ladsgroup@dns1004: START - running authdns-update
* 22:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5024.eqsin.wmnet with OS trixie
* 22:13 mutante: gerrit - restarting service for logging change
* 22:11 dzahn@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:10:00 on gerrit.wikimedia.org with reason: service restart
* 22:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on gerrit2003.wikimedia.org with reason: service restart
* 22:06 mutante: gerrit-spare: restarting gerrit
* 22:06 mutante: gerrit-replica: restarting gerrit
* 21:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 21:37 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 21:22 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300250{{!}}ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801)]], [[gerrit:1300248{{!}}tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade]] (duration: 08m 23s)
* 21:17 jforrester@deploy1003: jforrester: Continuing with deployment
* 21:15 jforrester@deploy1003: jforrester: Backport for [[gerrit:1300250{{!}}ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801)]], [[gerrit:1300248{{!}}tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:13 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1300250{{!}}ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801)]], [[gerrit:1300248{{!}}tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade]]
* 21:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5024
* 21:02 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5024
* 21:02 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300247{{!}}Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902)]] (duration: 06m 51s)
* 21:00 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5024
* 21:00 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5024.eqsin.wmnet 35.0.132.10.in-addr.arpa 5.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 21:00 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5024.eqsin.wmnet 35.0.132.10.in-addr.arpa 5.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 21:00 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:00 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5024 - brett@cumin2002"
* 20:59 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5024 - brett@cumin2002"
* 20:57 catrope@deploy1003: catrope: Continuing with deployment
* 20:57 catrope@deploy1003: catrope: Backport for [[gerrit:1300247{{!}}Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:55 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1300247{{!}}Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902)]]
* 20:54 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:50 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5024
* 20:49 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
* 20:48 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5020.*
* 20:44 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300073{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] (duration: 11m 55s)
* 20:40 catrope@deploy1003: catrope, gkyziridis: Continuing with deployment
* 20:34 catrope@deploy1003: catrope, gkyziridis: Backport for [[gerrit:1300073{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:32 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1300073{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]]
* 20:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5020.eqsin.wmnet with OS trixie
* 20:30 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300226{{!}}[arzwiki] Change the wordmark (T427720)]] (duration: 09m 49s)
* 20:25 catrope@deploy1003: gergesshamon, catrope: Continuing with deployment
* 20:22 catrope@deploy1003: gergesshamon, catrope: Backport for [[gerrit:1300226{{!}}[arzwiki] Change the wordmark (T427720)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:20 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1300226{{!}}[arzwiki] Change the wordmark (T427720)]]
* 19:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 19:53 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 19:30 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 19:27 bblack@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=1) rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 19:23 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2046.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:19 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2046.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:19 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5020
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5020
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2044.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:18 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5020
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5020.eqsin.wmnet 24.0.132.10.in-addr.arpa 4.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:18 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5020.eqsin.wmnet 24.0.132.10.in-addr.arpa 4.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:17 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5020 - brett@cumin2002"
* 19:17 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5020 - brett@cumin2002"
* 19:14 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2044.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:11 brett@cumin2002: START - Cookbook sre.dns.netbox
* 19:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 19:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2174: Migration of db2174.codfw.wmnet completed
* 19:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 19:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1218: Migration of db1218.eqiad.wmnet completed
* 18:24 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5020
* 18:23 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS trixie
* 18:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2174: Migration of db2174.codfw.wmnet completed
* 18:20 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 18:17 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1218: Migration of db1218.eqiad.wmnet completed
* 18:16 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5018.*
* 18:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2174.codfw.wmnet with OS trixie
* 18:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1218.eqiad.wmnet with OS trixie
* 17:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
* 17:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1218.eqiad.wmnet with reason: host reimage
* 17:46 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2010.codfw.wmnet with OS trixie
* 17:45 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 17:44 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 17:44 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
* 17:42 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1218.eqiad.wmnet with reason: host reimage
* 17:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94021)
* 17:29 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
* 17:26 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1218.eqiad.wmnet with OS trixie
* 17:26 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2174.codfw.wmnet with OS trixie
* 17:25 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:24 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:24 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1218: Upgrading db1218.eqiad.wmnet
* 17:24 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:24 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1218: Upgrading db1218.eqiad.wmnet
* 17:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 17:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2174: Upgrading db2174.codfw.wmnet
* 17:23 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:23 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
* 17:23 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:22 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2174: Upgrading db2174.codfw.wmnet
* 17:22 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:22 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 17:22 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:22 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 17:22 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-text and not P<nowiki>{</nowiki>cp7008*<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 17:21 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 17:21 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:19 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 17:19 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:18 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:18 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:17 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:17 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:17 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:13 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:12 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-ntp (exit_code=0) rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 17:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 17:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1206: Migration of db1206.eqiad.wmnet completed
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2010
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2010
* 17:02 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2010
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2010.codfw.wmnet 35.48.192.10.in-addr.arpa 5.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:02 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2010.codfw.wmnet 35.48.192.10.in-addr.arpa 5.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2010 - jasmine@cumin2002"
* 17:01 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2010 - jasmine@cumin2002"
* 16:57 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 16:50 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2010
* 16:50 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS trixie
* 16:41 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 16:39 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 16:39 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 16:34 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 16:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5018.eqsin.wmnet with OS trixie
* 16:22 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 16:20 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 16:17 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1206: Migration of db1206.eqiad.wmnet completed
* 16:15 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 16:15 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 16:14 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 16:12 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 16:12 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:11 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 16:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1206.eqiad.wmnet with OS trixie
* 16:01 blblack: apt: uploaded libvmod-wmfuniq 0.3.0 for trixie
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 15:53 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:52 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:51 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 15:50 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1206.eqiad.wmnet with reason: host reimage
* 15:45 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1206.eqiad.wmnet with reason: host reimage
* 15:43 sukhe@cumin1003: END (FAIL) - Cookbook sre.dns.admin (exit_code=99) DNS admin: depool drmrs [reason: no reason specified, no task ID specified]
* 15:42 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool drmrs [reason: no reason specified, no task ID specified]
* 15:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2173: Migration of db2173.codfw.wmnet completed
* 15:34 topranks: drain traffic through cr2-drmrs to reset pic0
* 15:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94013)
* 15:30 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1206.eqiad.wmnet with OS trixie
* 15:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1206: Upgrading db1206.eqiad.wmnet
* 15:28 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1206: Upgrading db1206.eqiad.wmnet
* 15:27 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:25 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:24 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1009
* 15:24 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Harroyo-wmf out of all services on: 2436 hosts
* 15:23 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1009
* 15:21 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:20 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release
* 15:19 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5018
* 15:19 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5018
* 15:18 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:18 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5018
* 15:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5018.eqsin.wmnet 18.0.132.10.in-addr.arpa 8.1.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 15:18 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5018.eqsin.wmnet 18.0.132.10.in-addr.arpa 8.1.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 15:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:15 brett@cumin2002: START - Cookbook sre.dns.netbox
* 15:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1195: Migration of db1195.eqiad.wmnet completed
* 15:12 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:11 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:11 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin1003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:11 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin1003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:08 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300169{{!}}Fix snak value display for rtl languages (T360854)]], [[gerrit:1300168{{!}}Fix snak value display for rtl languages (T360854)]] (duration: 08m 39s)
* 15:03 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 15:01 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1300169{{!}}Fix snak value display for rtl languages (T360854)]], [[gerrit:1300168{{!}}Fix snak value display for rtl languages (T360854)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:59 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1300169{{!}}Fix snak value display for rtl languages (T360854)]], [[gerrit:1300168{{!}}Fix snak value display for rtl languages (T360854)]]
* 14:58 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:55 Lucas_WMDE: lucaswerkmeister-wmde@deploy1003 $ printf 'https://www.mediawiki.org/keys/%s\n' '' 'keys.txt' 'keys.html' {{!}} mwscript-k8s --attach --comment=[[phab:T423267|T423267]] purgeList mediawikiwiki
* 14:54 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release, now with correct schema
* 14:53 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2173: Migration of db2173.codfw.wmnet completed
* 14:50 ayounsi@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin2003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:50 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:48 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:47 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299614{{!}}Add my public key to mediawiki.org/keys (T423267)]] (duration: 08m 33s)
* 14:46 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:42 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, matmarex: Continuing with deployment
* 14:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2173.codfw.wmnet with OS trixie
* 14:40 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, matmarex: Backport for [[gerrit:1299614{{!}}Add my public key to mediawiki.org/keys (T423267)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:40 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:40 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:38 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1299614{{!}}Add my public key to mediawiki.org/keys (T423267)]]
* 14:38 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 14:34 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:34 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:33 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 14:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1195: Migration of db1195.eqiad.wmnet completed
* 14:28 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 14:26 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 14:26 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 14:24 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release, now with dblist translate
* 14:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2173.codfw.wmnet with reason: host reimage
* 14:23 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 14:22 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 14:22 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 14:21 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 14:20 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart (exit_code=0) rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 14:20 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2173.codfw.wmnet with reason: host reimage
* 14:20 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1195.eqiad.wmnet with OS trixie
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 14:16 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 14:15 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 14:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 14:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 14:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 14:13 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:13 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 14:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 14:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 14:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 14:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 14:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 14:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 14:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2173.codfw.wmnet with OS trixie
* 14:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 14:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1195.eqiad.wmnet with reason: host reimage
* 14:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 13:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2173: Upgrading db2173.codfw.wmnet
* 13:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2173: Upgrading db2173.codfw.wmnet
* 13:58 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:58 atsuko@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/ttmserver-export.php --wiki=default --ttmserver eqiad-test # [[phab:T425377|T425377]] populating production index on test cluster to estimate time required for the release
* 13:56 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1195.eqiad.wmnet with reason: host reimage
* 13:54 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Atieno out of all services on: 2436 hosts
* 13:42 Lucas_WMDE: UTC afternoon backport+config window done
* 13:42 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1195.eqiad.wmnet with OS trixie
* 13:36 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297237{{!}}wmf-config: Update private subnets to include additions (T427393)]] (duration: 07m 20s)
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1195: Upgrading db1195.eqiad.wmnet
* 13:33 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling restart_daemons on A:hcaptcha-proxy and A:hcaptcha-proxy
* 13:33 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling restart_daemons on A:durum and A:durum
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2170: Migration of db2170.codfw.wmnet completed
* 13:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1195: Upgrading db1195.eqiad.wmnet
* 13:32 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:32 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, brett: Continuing with deployment
* 13:32 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough
* 13:31 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 13:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, brett: Backport for [[gerrit:1297237{{!}}wmf-config: Update private subnets to include additions (T427393)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:31 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 13:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1297237{{!}}wmf-config: Update private subnets to include additions (T427393)]]
* 13:28 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5018.eqsin.wmnet with reason: host down
* 13:28 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling restart_daemons on A:tcpproxy and A:tcpproxy
* 13:25 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5018.eqsin.wmnet,service=(cdn{{!}}ats-be)
* 13:22 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 13:20 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling restart_daemons on A:durum and A:durum
* 13:20 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling restart_daemons on A:hcaptcha-proxy and A:hcaptcha-proxy
* 13:19 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299676{{!}}Enable ULS v2 on group0 wikis]] (duration: 17m 00s)
* 13:19 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough
* 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1186: Migration of db1186.eqiad.wmnet completed
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:15 sbisson@deploy1003: sbisson, abi: Continuing with deployment
* 13:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy rolling restart_daemons on A:tcpproxy and A:tcpproxy
* 13:05 sbisson@deploy1003: sbisson, abi: Backport for [[gerrit:1299676{{!}}Enable ULS v2 on group0 wikis]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1014.eqiad.wmnet with OS trixie
* 13:02 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1299676{{!}}Enable ULS v2 on group0 wikis]]
* 12:47 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2170: Migration of db2170.codfw.wmnet completed
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5004.eqsin.wmnet with OS bookworm
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1014.eqiad.wmnet with reason: host reimage
* 12:42 topranks: re-map DSCP AF41 from 'low' to 'normal' priority qos class on network [[phab:T424640|T424640]]
* 12:41 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1014.eqiad.wmnet with reason: host reimage
* 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2170.codfw.wmnet with OS trixie
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1186: Migration of db1186.eqiad.wmnet completed
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1014
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host rdb1014
* 12:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1186.eqiad.wmnet with OS trixie
* 12:21 jiji@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host rdb1014
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) rdb1014.eqiad.wmnet 42.48.64.10.in-addr.arpa 2.4.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:21 jiji@cumin1003: START - Cookbook sre.dns.wipe-cache rdb1014.eqiad.wmnet 42.48.64.10.in-addr.arpa 2.4.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host rdb1014 - jiji@cumin1003"
* 12:21 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host rdb1014 - jiji@cumin1003"
* 12:20 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2170.codfw.wmnet with reason: host reimage
* 12:16 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 12:13 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1014
* 12:12 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1014.eqiad.wmnet with OS trixie
* 12:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2170.codfw.wmnet with reason: host reimage
* 12:08 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300104{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1300102{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1299643{{!}}wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792)]] (duration: 11m 06s)
* 12:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1186.eqiad.wmnet with reason: host reimage
* 12:03 reedy@deploy1003: reedy: Continuing with deployment
* 12:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1186.eqiad.wmnet with reason: host reimage
* 11:59 reedy@deploy1003: reedy: Backport for [[gerrit:1300104{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1300102{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1299643{{!}}wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes c
* 11:57 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1300104{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1300102{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1299643{{!}}wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792)]]
* 11:53 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2170.codfw.wmnet with OS trixie
* 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ganeti5004
* 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti5004
* 11:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2170: Upgrading db2170.codfw.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2170: Upgrading db2170.codfw.wmnet
* 11:49 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti5004
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti5004.eqsin.wmnet 40.0.132.10.in-addr.arpa 0.4.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 11:49 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti5004.eqsin.wmnet 40.0.132.10.in-addr.arpa 0.4.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5004 - jmm@cumin2002"
* 11:49 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5004 - jmm@cumin2002"
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:48 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1186.eqiad.wmnet with OS trixie
* 11:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1186: Upgrading db1186.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1186: Upgrading db1186.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:38 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:35 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 11:34 jmm@cumin2002: START - Cookbook sre.hosts.move-vlan for host ganeti5004
* 11:34 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 11:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5004.eqsin.wmnet with OS bookworm
* 11:34 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 11:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1151: Security updates
* 11:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:33 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:33 root@cumin1003: START - Cookbook sre.mysql.pool pool db1151: Security updates
* 11:31 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:30 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:30 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:30 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:27 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:27 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:23 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:23 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:23 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 11:23 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 11:16 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 11:15 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 11:15 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:15 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:09 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1151: Security updates
* 11:09 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:09 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:09 root@cumin1003: START - Cookbook sre.mysql.depool depool db1151: Security updates
* 11:08 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300092{{!}}ProductionServices: re-add poolcounter2006 (T426736)]] (duration: 06m 55s)
* 11:04 blake@deploy1003: blake: Continuing with deployment
* 11:04 blake@deploy1003: blake: Backport for [[gerrit:1300092{{!}}ProductionServices: re-add poolcounter2006 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:03 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:01 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300092{{!}}ProductionServices: re-add poolcounter2006 (T426736)]]
* 10:59 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2006.codfw.wmnet
* 10:57 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:57 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:56 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:56 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:56 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:56 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2006.codfw.wmnet
* 10:56 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300087{{!}}ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736)]] (duration: 06m 42s)
* 10:51 blake@deploy1003: blake: Continuing with deployment
* 10:51 moritzm: remove ganeti5004 from eqsin cluster for reimage [[phab:T428229|T428229]]
* 10:51 blake@deploy1003: blake: Backport for [[gerrit:1300087{{!}}ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:49 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300087{{!}}ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736)]]
* 10:47 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2005.codfw.wmnet
* 10:47 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:46 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:46 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:45 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:43 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2005.codfw.wmnet
* 10:43 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300082{{!}}ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736)]] (duration: 07m 38s)
* 10:41 moritzm: installing nginx security updates
* 10:38 blake@deploy1003: blake: Continuing with deployment
* 10:38 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1152: Security updates
* 10:38 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:38 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:38 root@cumin1003: START - Cookbook sre.mysql.pool pool db1152: Security updates
* 10:38 blake@deploy1003: blake: Backport for [[gerrit:1300082{{!}}ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:37 moritzm: failover Ganeti master in eqsin to ganeti5007 [[phab:T428229|T428229]]
* 10:35 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300082{{!}}ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736)]]
* 10:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1007.eqiad.wmnet
* 10:29 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1007.eqiad.wmnet
* 10:29 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300072{{!}}ProductionServices: reboot poolcounter1007 (T426736)]] (duration: 07m 45s)
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 10:27 jmm@cumin2002: DONE (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for sretest2009.codfw.wmnet: Renew puppet certificate - jmm@cumin2002
* 10:24 blake@deploy1003: blake: Continuing with deployment
* 10:23 blake@deploy1003: blake: Backport for [[gerrit:1300072{{!}}ProductionServices: reboot poolcounter1007 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:21 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:21 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300072{{!}}ProductionServices: reboot poolcounter1007 (T426736)]]
* 10:21 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:21 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:21 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:20 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1006.eqiad.wmnet
* 10:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Security updates
* 10:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:14 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:14 root@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Security updates
* 10:13 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1006.eqiad.wmnet
* 10:12 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300064{{!}}ProductionServices: reboot poolcounter1006.eqiad (T426736)]] (duration: 07m 46s)
* 10:07 blake@deploy1003: blake: Continuing with deployment
* 10:06 blake@deploy1003: blake: Backport for [[gerrit:1300064{{!}}ProductionServices: reboot poolcounter1006.eqiad (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:04 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300064{{!}}ProductionServices: reboot poolcounter1006.eqiad (T426736)]]
* 09:57 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300058{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]], [[gerrit:1300059{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]] (duration: 09m 32s)
* 09:52 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1300058{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]], [[gerrit:1300059{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1300058{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]], [[gerrit:1300059{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]]
* 09:35 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 09:34 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 09:32 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 09:32 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 09:26 moritzm: upgrade routinator in eqiad to 0.15.2 [[phab:T428456|T428456]]
* 09:23 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:23 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 09:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus5003.eqsin.wmnet to plain
* 09:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to plain
* 09:15 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:29 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:29 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:01 fceratto@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host db1215.eqiad.wmnet with OS trixie
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 07:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
* 07:44 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1215.eqiad.wmnet with reason: host reimage
* 07:41 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 07:40 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
* 07:40 moritzm: installing openssl security updates
* 07:39 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1215.eqiad.wmnet with reason: host reimage
* 07:38 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 07:37 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 07:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:29 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299556{{!}}ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561{{!}}ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529{{!}}translate: adding separate read/write endpoints (T425377)]] (duration: 14m 03s)
* 07:25 atsuko@deploy1003: atsuko: Continuing with deployment
* 07:23 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1215.eqiad.wmnet with OS trixie
* 07:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1215.eqiad.wmnet with reason: Reimage
* 07:21 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:20 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:20 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:17 atsuko@deploy1003: atsuko: Backport for [[gerrit:1299556{{!}}ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561{{!}}ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529{{!}}translate: adding separate read/write endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:15 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1299556{{!}}ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561{{!}}ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529{{!}}translate: adding separate read/write endpoints (T425377)]]
* 07:14 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:12 atsukoito: backporting extensions/Translate to wmf/1.47.0-wmf.5 and applying the config
* 07:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 06:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 06:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 05:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 05:43 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 05:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 05:41 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 47s)
* 02:07 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1008.eqiad.wmnet with OS trixie
* 02:03 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 02:02 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:52 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:51 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:51 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:50 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:50 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:49 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1008.eqiad.wmnet with reason: host reimage
* 01:49 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:49 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:49 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:49 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:48 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 01:48 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 01:47 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 01:47 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 01:46 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 01:46 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 01:44 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 01:44 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 01:43 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 01:43 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1008.eqiad.wmnet with reason: host reimage
* 01:25 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main1008
* 01:24 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main1008
* 01:24 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main1008
* 01:24 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main1008.eqiad.wmnet 45.32.64.10.in-addr.arpa 5.4.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 01:23 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main1008.eqiad.wmnet 45.32.64.10.in-addr.arpa 5.4.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 01:23 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:23 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1008 - jasmine@cumin2002"
* 01:23 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1008 - jasmine@cumin2002"
* 01:19 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 01:12 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main1008
* 01:11 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1008.eqiad.wmnet with OS trixie
* 01:00 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2009.codfw.wmnet with OS trixie
* 00:54 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 00:53 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 00:43 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
* 00:40 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:38 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
* 00:38 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:38 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:37 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:37 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:36 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 00:36 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:34 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 00:34 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 00:33 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 00:33 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 00:32 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 00:32 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 00:32 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2009
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2009
* 00:15 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2009
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2009.codfw.wmnet 33.48.192.10.in-addr.arpa 3.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:15 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2009.codfw.wmnet 33.48.192.10.in-addr.arpa 3.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2009 - jasmine@cumin2002"
* 00:15 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2009 - jasmine@cumin2002"
* 00:10 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 00:03 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2009
* 00:03 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS trixie
== 2026-06-09 ==
* 22:50 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299640{{!}}HandleSectionLinks: add temporary fallback to identify html headings (T428677)]] (duration: 08m 59s)
* 22:45 cscott@deploy1003: cscott: Continuing with deployment
* 22:43 cscott@deploy1003: cscott: Backport for [[gerrit:1299640{{!}}HandleSectionLinks: add temporary fallback to identify html headings (T428677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:41 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1299640{{!}}HandleSectionLinks: add temporary fallback to identify html headings (T428677)]]
* 22:15 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299639{{!}}[Bug] Donor Badge: Remove client prefs for control group (T428501)]] (duration: 20m 57s)
* 22:11 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:07 mutante: gerrit - apache httpd log file location moved to /srv/gerrit/site_path/review_site/logs/ [[phab:T425667|T425667]]
* 22:06 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on gerrit2003.wikimedia.org with reason: debug
* 21:56 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1299639{{!}}[Bug] Donor Badge: Remove client prefs for control group (T428501)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1299639{{!}}[Bug] Donor Badge: Remove client prefs for control group (T428501)]]
* 21:52 ryankemper: [[phab:T428241|T428241]] removed retired wdqs2009 full-graph journal dump (446G x2, ~892G) from clouddumps100[1-2]:/srv/dumps/xmldatadumps/public/other/wdqs
* 21:49 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299602{{!}}Revert "Create VectorComponentPageToolbar component" (T428649)]] (duration: 08m 16s)
* 21:48 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
* 21:45 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 21:43 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1299602{{!}}Revert "Create VectorComponentPageToolbar component" (T428649)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:41 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1299602{{!}}Revert "Create VectorComponentPageToolbar component" (T428649)]]
* 21:34 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit1003.wikimedia.org with reason: debug
* 21:27 maryum: Deployed security fix for [[phab:T428324|T428324]]
* 21:24 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
* 21:15 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
* 21:06 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
* 20:50 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS trixie
* 20:50 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299588{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270)]], [[gerrit:1299589{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T428270)]] (duration: 11m 13s)
* 20:46 cscott@deploy1003: cscott: Continuing with deployment
* 20:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS trixie
* 20:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:42 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:41 cscott@deploy1003: cscott: Backport for [[gerrit:1299588{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270)]], [[gerrit:1299589{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T428270)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:39 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1299588{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270)]], [[gerrit:1299589{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T428270)]]
* 20:38 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:33 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:32 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299454{{!}}wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902)]] (duration: 22m 08s)
* 20:28 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:28 cscott@deploy1003: cscott, gkyziridis: Continuing with deployment
* 20:24 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2004
* 20:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2004
* 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2003
* 20:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2003
* 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2002
* 20:13 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2002
* 20:13 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2001
* 20:13 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2001
* 20:12 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:12 cscott@deploy1003: cscott, gkyziridis: Backport for [[gerrit:1299454{{!}}wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1299454{{!}}wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902)]]
* 20:09 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:04 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:59 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:53 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:48 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:47 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:47 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:28 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wdqs1015.eqiad.wmnet
* 19:28 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:28 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wdqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
* 19:27 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wdqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
* 19:20 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
* 19:15 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2008.codfw.wmnet with OS trixie
* 19:15 ryankemper@cumin2002: START - Cookbook sre.hosts.decommission for hosts wdqs1015.eqiad.wmnet
* 19:12 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 19:12 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 19:00 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:58 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 18:58 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
* 18:58 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 18:58 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:57 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:57 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 18:56 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 18:56 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:55 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:55 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 18:55 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 18:54 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 18:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:54 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 18:53 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 18:53 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 18:53 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2003 to codfw - jhancock@cumin2002"
* 18:52 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2003 to codfw - jhancock@cumin2002"
* 18:52 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 18:52 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 18:51 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
* 18:51 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 18:51 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 18:51 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 18:50 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 18:50 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 18:47 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:47 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:47 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:42 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:42 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:31 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 18:29 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS trixie
* 18:26 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2008.codfw.wmnet with OS trixie
* 17:48 mutante: https://releases.wikimedia.org {{!}} https://releases-jenkins.wikimedia.org - down for maintenance [[phab:T418299|T418299]]
* 17:48 cmooney@dns2005: END - running authdns-update
* 17:47 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: reimage
* 17:47 cmooney@dns2005: START - running authdns-update
* 17:46 sukhe: sudo cumin 'A:hcaptcha-proxy' 'run-puppet-agent': rolling out CR {{Gerrit|1299427}} [[phab:T428539|T428539]]
* 17:43 jayme: kafka-main2008 is down due to hardware failure [[phab:T428654|T428654]]
* 17:32 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1002.eqiad.wmnet with OS trixie
* 17:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
* 17:06 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
* 17:05 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2008
* 17:05 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2008
* 17:04 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:04 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2008
* 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2008.codfw.wmnet 4.32.192.10.in-addr.arpa 4.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:04 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:04 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2008.codfw.wmnet 4.32.192.10.in-addr.arpa 4.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2008 - jasmine@cumin2002"
* 17:04 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5018
* 17:04 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2008 - jasmine@cumin2002"
* 17:03 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5018.eqsin.wmnet with OS trixie
* 16:58 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:58 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:57 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 16:57 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:57 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:50 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf1002.eqiad.wmnet with OS trixie
* 16:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:47 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1001.eqiad.wmnet with OS trixie
* 16:47 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 16:47 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/redioscope: apply
* 16:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:41 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 16:41 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 16:35 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2008
* 16:34 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS trixie
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:31 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
* 16:30 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
* 16:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
* 16:29 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:26 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
* 16:23 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
* 16:22 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: apply
* 16:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:16 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:12 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf1001.eqiad.wmnet with OS trixie
* 16:10 jiji@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'sync'.
* 16:09 jiji@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'sync'.
* 16:07 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf2002.codfw.wmnet with OS trixie
* 16:02 jiji@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 16:02 jiji@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 16:00 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'.
* 15:59 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
* 15:59 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'.
* 15:59 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 15:59 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'.
* 15:59 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/termbox: apply
* 15:58 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/termbox: apply
* 15:58 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/termbox: apply
* 15:57 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 15:57 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 15:57 lucaswerkmeister-wmde@deploy1003: helmfile [staging] DONE helmfile.d/services/termbox: apply
* 15:56 lucaswerkmeister-wmde@deploy1003: helmfile [staging] START helmfile.d/services/termbox: apply
* 15:54 jiji@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:53 jiji@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 15:51 jiji@deploy1003: Finished scap sync-world: redeploy {{Gerrit|1299468}} (duration: 07m 23s)
* 15:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf2002.codfw.wmnet with reason: host reimage
* 15:47 jiji@deploy1003: jiji: Continuing with deployment
* 15:46 jiji@deploy1003: jiji: redeploy {{Gerrit|1299468}} synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:46 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf2002.codfw.wmnet with reason: host reimage
* 15:45 jiji@deploy1003: Started scap sync-world: redeploy {{Gerrit|1299468}}
* 15:43 brouberol@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-eqiad
* 15:34 brennen@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab1004 for [[phab:T410849|T410849]] (followup for robots.txt) (duration: 00m 40s)
* 15:33 brennen@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab1004 for [[phab:T410849|T410849]] (followup for robots.txt)
* 15:33 brennen@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab2002 for [[phab:T410849|T410849]] (followup for robots.txt) (duration: 00m 45s)
* 15:32 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299468{{!}}ProductionServices.php: switch filebackend.php to rdb2015:6381 #2 (T418918 T291916)]] (duration: 07m 21s)
* 15:32 brennen@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab2002 for [[phab:T410849|T410849]] (followup for robots.txt)
* 15:28 jiji@deploy1003: Rolling back deployment
* 15:27 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf2002.codfw.wmnet with OS trixie
* 15:27 jiji@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 15:26 jiji@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 15:25 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1299468{{!}}ProductionServices.php: switch filebackend.php to rdb2015:6381 #2 (T418918 T291916)]]
* 15:22 urbanecm: Remove `migrateMentorStatusAwayToCommunityConfiguration` from updatelog on all wikis ([[phab:T409170|T409170]]; the script was only ever run as a dry-run)
* 15:21 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 15:21 jiji@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 15:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf2001.codfw.wmnet with OS trixie
* 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@d244a3e]: deploy phab1004 for [[phab:T410849|T410849]] (duration: 00m 42s)
* 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@d244a3e]: deploy phab1004 for [[phab:T410849|T410849]]
* 15:02 brennen@deploy1003: Finished deploy [phabricator/deployment@d244a3e]: deploy phab2002 for [[phab:T410849|T410849]] (duration: 00m 45s)
* 15:01 brennen@deploy1003: Started deploy [phabricator/deployment@d244a3e]: deploy phab2002 for [[phab:T410849|T410849]]
* 14:58 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf2001.codfw.wmnet with reason: host reimage
* 14:52 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf2001.codfw.wmnet with reason: host reimage
* 14:52 arnaudb@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab[2002-2003].codfw.wmnet,phab[1004-1006].eqiad.wmnet with reason: [[phab:T410849|T410849]]
* 14:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 14:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 14:40 moritzm: upgrade routinator in codfw to 0.15.2 [[phab:T428456|T428456]]
* 14:35 brouberol@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 14:33 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf2001.codfw.wmnet with OS trixie
* 14:26 brouberol@cumin1003: END (ERROR) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=97) rolling reboot on A:cephosd-eqiad
* 14:26 brouberol@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 14:20 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-codfw
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host parsoidtest1001.eqiad.wmnet
* 14:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2153: Migration of db2153.codfw.wmnet completed
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of rpki2003.codfw.wmnet to drbd
* 14:14 moritzm: imported routinator 0.15.2-1bookworm to thirdparty/routinator for bookworm-wikimedia [[phab:T428456|T428456]]
* 14:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1184: Migration of db1184.eqiad.wmnet completed
* 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host parsoidtest1001.eqiad.wmnet
* 14:07 Dreamy_Jazz: Afternoon UTC backport window done
* 14:07 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:06 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299495{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]], [[gerrit:1299502{{!}}SecurePollLogPager: Cast user IDs to ints before use (T428599)]] (duration: 06m 53s)
* 14:06 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: rack depool
* 14:03 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of rpki2003.codfw.wmnet to drbd
* 14:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow2004.codfw.wmnet to drbd
* 14:02 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:02 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1299495{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]], [[gerrit:1299502{{!}}SecurePollLogPager: Cast user IDs to ints before use (T428599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:59 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1299495{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]], [[gerrit:1299502{{!}}SecurePollLogPager: Cast user IDs to ints before use (T428599)]]
* 13:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:56 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:56 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:56 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 13:56 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 13:55 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:55 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* {{safesubst:SAL entry|1=13:55 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298929{{!}}Simplify fragment processing (T423700)]], [[gerrit:1298926{{!}}Move ::getFragmentsToTransform() to Content<nowiki>{</nowiki>Text,DOM<nowiki>}</nowiki>TransformStage]], [[gerrit:1298927{{!}}OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages]], [[gerrit:1298925{{!}}Reset DeduplicateStyles state between different pipeline executions (T428336 T428215)]], [[gerrit:1299497}}
* 13:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:51 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow2004.codfw.wmnet to drbd
* 13:50 cscott@deploy1003: cscott: Continuing with deployment
* 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2045.codfw.wmnet to cluster codfw and group A
* 13:48 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2045.codfw.wmnet to cluster codfw and group A
* 13:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2027.codfw.wmnet to cluster codfw and group A
* 13:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2027.codfw.wmnet to cluster codfw and group A
* 13:46 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 13:45 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 13:44 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* {{safesubst:SAL entry|1=13:42 cscott@deploy1003: cscott: Backport for [[gerrit:1298929{{!}}Simplify fragment processing (T423700)]], [[gerrit:1298926{{!}}Move ::getFragmentsToTransform() to Content<nowiki>{</nowiki>Text,DOM<nowiki>}</nowiki>TransformStage]], [[gerrit:1298927{{!}}OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages]], [[gerrit:1298925{{!}}Reset DeduplicateStyles state between different pipeline executions (T428336 T428215)]], [[gerrit:1299497{{!}}Store indicators}}
* 13:41 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* {{safesubst:SAL entry|1=13:40 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1298929{{!}}Simplify fragment processing (T423700)]], [[gerrit:1298926{{!}}Move ::getFragmentsToTransform() to Content<nowiki>{</nowiki>Text,DOM<nowiki>}</nowiki>TransformStage]], [[gerrit:1298927{{!}}OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages]], [[gerrit:1298925{{!}}Reset DeduplicateStyles state between different pipeline executions (T428336 T428215)]], [[gerrit:1299497{{!}}}}
* 13:40 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw
* 13:39 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 13:37 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 13:35 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 13:33 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 13:32 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 13:32 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298834{{!}}config: Disable EmailConfirmationBanner on all wikis (T428291)]] (duration: 07m 01s)
* 13:30 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2153: Migration of db2153.codfw.wmnet completed
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 lucaswerkmeister-wmde@deploy1003: mmartorana, lucaswerkmeister-wmde: Continuing with deployment
* 13:27 lucaswerkmeister-wmde@deploy1003: mmartorana, lucaswerkmeister-wmde: Backport for [[gerrit:1298834{{!}}config: Disable EmailConfirmationBanner on all wikis (T428291)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:26 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1184: Migration of db1184.eqiad.wmnet completed
* 13:25 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298834{{!}}config: Disable EmailConfirmationBanner on all wikis (T428291)]]
* 13:25 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:24 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:21 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 13:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2153.codfw.wmnet with OS trixie
* 13:20 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2241: rack depool
* 13:20 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1237: repool after maintenance db1237
* 13:19 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298654{{!}}Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206)]] (duration: 09m 40s)
* 13:17 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2006.codfw.wmnet
* 13:17 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2006.codfw.wmnet
* 13:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2251-2253].codfw.wmnet
* 13:16 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2251-2253].codfw.wmnet
* 13:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2005.codfw.wmnet
* 13:16 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2005.codfw.wmnet
* 13:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1184.eqiad.wmnet with OS trixie
* 13:14 lucaswerkmeister-wmde@deploy1003: neriah, lucaswerkmeister-wmde: Continuing with deployment
* 13:11 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
* 13:11 lucaswerkmeister-wmde@deploy1003: neriah, lucaswerkmeister-wmde: Backport for [[gerrit:1298654{{!}}Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:09 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298654{{!}}Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206)]]
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2153.codfw.wmnet with reason: host reimage
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1015.eqiad.wmnet with OS trixie
* 12:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1184.eqiad.wmnet with reason: host reimage
* 12:58 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2153.codfw.wmnet with reason: host reimage
* 12:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1016.eqiad.wmnet with OS trixie
* 12:57 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
* 12:56 XioNoX: lsw1-a4-codfw> request system reboot - [[phab:T427357|T427357]]
* 12:55 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1184.eqiad.wmnet with reason: host reimage
* 12:50 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299477{{!}}hCaptcha: Roll out to all wikis for api account creation. (T426050)]] (duration: 07m 21s)
* 12:46 kharlan@deploy1003: kharlan, dbrant: Continuing with deployment
* 12:46 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
* 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 12:45 kharlan@deploy1003: kharlan, dbrant: Backport for [[gerrit:1299477{{!}}hCaptcha: Roll out to all wikis for api account creation. (T426050)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:45 topranks: shut sub-interfaces for row A/B legacy vlans on cr1-codfw [[phab:T427357|T427357]]
* 12:45 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
* 12:43 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1299477{{!}}hCaptcha: Roll out to all wikis for api account creation. (T426050)]]
* 12:42 topranks: increase OSPF cost on ssw1-a1-codfw link to lsw1-a4-codfw to force traffic via alternate spine [[phab:T427357|T427357]]
* 12:41 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299478{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]] (duration: 07m 02s)
* 12:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 12:40 moritzm: installing wireshark security updates
* 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2153.codfw.wmnet with OS trixie
* 12:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1184.eqiad.wmnet with OS trixie
* 12:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:36 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1299478{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2153: Upgrading db2153.codfw.wmnet
* 12:34 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1237: repool after maintenance db1237
* 12:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1299478{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]]
* 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2153: Upgrading db2153.codfw.wmnet
* 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1184: Upgrading db1184.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1184: Upgrading db1184.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1237.eqiad.wmnet with OS trixie
* 12:32 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 12:32 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 12:29 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:29 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:27 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2005.codfw.wmnet
* 12:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2046: repool after maintenance
* 12:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2006.codfw.wmnet
* 12:23 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298829{{!}}wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126)]] (duration: 16m 04s)
* 12:23 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2006.codfw.wmnet
* 12:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2251-2253].codfw.wmnet
* 12:22 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2005.codfw.wmnet
* 12:20 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2251-2253].codfw.wmnet
* 12:20 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:20 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: rack depool
* 12:20 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:20 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2241: rack depool
* 12:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1016
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1016
* 12:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1015
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1015
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1016.eqiad.wmnet with OS trixie
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1015.eqiad.wmnet with OS trixie
* 12:17 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
* 12:17 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 24 hosts with reason: Rack A4 depool
* 12:16 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Continuing with deployment
* 12:15 topranks: drain traffic on ssw1-a1-codfw - add gshut community in evpn underlay - [[phab:T427357|T427357]]
* 12:14 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
* 12:13 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Backport for [[gerrit:1298829{{!}}wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1237.eqiad.wmnet with reason: host reimage
* 12:07 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1298829{{!}}wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126)]]
* 12:05 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1237.eqiad.wmnet with reason: host reimage
* 12:00 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Dmaza out of all services on: 2435 hosts
* 11:51 atsuko@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 11:51 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
* 11:49 atsuko@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 11:48 atsuko@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 11:47 atsuko@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 11:45 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:44 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:43 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:43 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2046: repool after maintenance
* 11:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:36 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2046.codfw.wmnet with OS trixie
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2185.codfw.wmnet with reason: Reimage
* 11:31 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging HMonroy out of all services on: 2435 hosts
* 11:28 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging KSiebert out of all services on: 2435 hosts
* 11:26 slyngs: CAS-SSO upgrade to version 7.3.7.2
* 11:26 slyngshede@dns1004: END - running authdns-update
* 11:24 slyngshede@dns1004: START - running authdns-update
* 11:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2046.codfw.wmnet with reason: host reimage
* 11:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1043: repool after upgrade
* 11:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2046.codfw.wmnet with reason: host reimage
* 10:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2046.codfw.wmnet with OS trixie
* 10:53 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2046: Upgrading es2046.codfw.wmnet
* 10:53 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2046: Upgrading es2046.codfw.wmnet
* 10:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 10:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 10:52 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 10:52 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:52 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 10:51 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:32 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1043: repool after upgrade
* 10:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:28 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1160: Repooling
* 10:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1043.eqiad.wmnet with OS trixie
* 10:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:17 elukey: complete rollout of apache2 upgrades
* 10:16 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:15 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:13 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:12 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:12 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1043.eqiad.wmnet with reason: host reimage
* 10:04 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1043.eqiad.wmnet with reason: host reimage
* 10:04 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:04 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:04 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:57 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
* 09:51 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:51 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:50 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:50 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS trixie
* 09:48 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1043: Upgrading es1043.eqiad.wmnet
* 09:48 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:47 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:45 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 09:41 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 09:36 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=5 --verbose --last-checked="20260603"` (after stopping previous scan run)
* 09:34 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=5 --verbose` (after stopping previous scan run)
* 09:27 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 09:26 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 09:17 fceratto@cumin1003: MariaDB change: Setting sections s5 as read-write
* 09:17 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1043: Upgrading es1043.eqiad.wmnet
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:12 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1042 to es4 eqiad primary [[phab:T428386|T428386]]', diff saved to https://phabricator.wikimedia.org/P93943 and previous config saved to /var/cache/conftool/dbconfig/20260609-091215-marostegui.json
* 09:11 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1043 to es4 eqiad primary [[phab:T428386|T428386]]', diff saved to https://phabricator.wikimedia.org/P93942 and previous config saved to /var/cache/conftool/dbconfig/20260609-091147-marostegui.json
* 09:03 jiji@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=registry2005.codfw.wmnet
* 08:59 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:59 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:57 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1237.eqiad.wmnet with OS trixie
* 08:55 jiji@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=registry2005.codfw.wmnet
* 08:55 jiji@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=registry2004.codfw.wmnet
* 08:50 jiji@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=registry2004.codfw.wmnet
* 08:22 jiji@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=docker-registry,name=codfw
* 08:22 jiji@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=docker-registry,name=eqiad
* 08:08 jiji@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=docker-registry,name=eqiad
* 08:08 jiji@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=docker-registry,name=codfw
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix typoes - ayounsi@cumin1003"
* 07:59 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix typoes - ayounsi@cumin1003"
* 07:52 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:47 brouberol@dns1004: END - running authdns-update
* 07:46 brouberol@dns1004: START - running authdns-update
* 07:44 brouberol@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:43 brouberol@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:43 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:42 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:41 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:39 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:38 brouberol@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 07:37 brouberol@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 07:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
* 07:36 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.major-upgrade (exit_code=97)
* 07:36 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 07:36 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:26 fceratto@dns1004: END - running authdns-update
* 07:24 fceratto@dns1004: START - running authdns-update
* 07:22 marostegui@dns1004: END - running authdns-update
* 07:21 marostegui@dns1004: START - running authdns-update
* 07:19 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:19 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fix dse-k8s-wdqs2002 duplicate ipv6 address - elukey@cumin1003"
* 07:19 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fix dse-k8s-wdqs2002 duplicate ipv6 address - elukey@cumin1003"
* 07:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 07:12 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 07:11 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1160: Repooling
* 07:11 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
* 07:11 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1160: Repooling
* 07:11 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
* 07:00 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:00 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1237.eqiad.wmnet with OS trixie
* 06:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1160 [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93940 and previous config saved to /var/cache/conftool/dbconfig/20260609-062412-fceratto.json
* 06:17 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:16 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:16 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:16 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:15 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:15 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:15 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:14 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:12 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1244 to s4 primary and set section read-write [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93939 and previous config saved to /var/cache/conftool/dbconfig/20260609-061222-fceratto.json
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Set s4 eqiad as read-only for maintenance - [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93938 and previous config saved to /var/cache/conftool/dbconfig/20260609-061131-fceratto.json
* 06:10 federico3: Starting s4 eqiad failover from db1160 to db1244 - [[phab:T426086|T426086]]
* 06:01 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1244 with weight 0 [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93937 and previous config saved to /var/cache/conftool/dbconfig/20260609-060121-fceratto.json
* 06:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 40 hosts with reason: Primary switchover s4 [[phab:T426086|T426086]]
* 05:40 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
* 05:37 marostegui@dns1004: START - running authdns-update
* 05:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1237: Upgrading db1237.eqiad.wmnet
* 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1237: Upgrading db1237.eqiad.wmnet
* 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1237 [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93935 and previous config saved to /var/cache/conftool/dbconfig/20260609-052420-marostegui.json
* 05:23 marostegui@dns1004: START - running authdns-update
* 05:23 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1220 to x1 primary and set section read-write [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93934 and previous config saved to /var/cache/conftool/dbconfig/20260609-052311-marostegui.json
* 05:22 marostegui@cumin1003: dbctl commit (dc=all): 'Set x1 eqiad as read-only for maintenance - [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93933 and previous config saved to /var/cache/conftool/dbconfig/20260609-052253-marostegui.json
* 05:22 marostegui: Starting x1 eqiad failover from db1237 to db1220 - [[phab:T428158|T428158]]
* 05:19 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1220 with weight 0 [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93932 and previous config saved to /var/cache/conftool/dbconfig/20260609-051859-marostegui.json
* 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 16 hosts with reason: Primary switchover x1 [[phab:T428158|T428158]]
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.3 (duration: 02m 43s)
* 03:40 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.6 refs [[phab:T423915|T423915]] (duration: 37m 16s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-08 ==
* 22:00 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298915{{!}}CommonSettings: Set $wgScoreSafeMode = false (T428484)]] (duration: 07m 42s)
* 21:56 reedy@deploy1003: reedy: Continuing with deployment
* 21:54 reedy@deploy1003: reedy: Backport for [[gerrit:1298915{{!}}CommonSettings: Set $wgScoreSafeMode = false (T428484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:53 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1298915{{!}}CommonSettings: Set $wgScoreSafeMode = false (T428484)]]
* 21:12 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298891{{!}}OOUIHTMLForm: Avoid treating form header as a clickable label (T428359)]] (duration: 08m 10s)
* 21:07 mlitn@deploy1003: mlitn, neriah: Continuing with deployment
* 21:05 mlitn@deploy1003: mlitn, neriah: Backport for [[gerrit:1298891{{!}}OOUIHTMLForm: Avoid treating form header as a clickable label (T428359)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:03 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1298891{{!}}OOUIHTMLForm: Avoid treating form header as a clickable label (T428359)]]
* 20:43 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297162{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias]], [[gerrit:1298841{{!}}Squashed diff to master]] (duration: 07m 05s)
* 20:39 mlitn@deploy1003: mlitn: Continuing with deployment
* 20:38 mlitn@deploy1003: mlitn: Backport for [[gerrit:1297162{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias]], [[gerrit:1298841{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:36 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1297162{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias]], [[gerrit:1298841{{!}}Squashed diff to master]]
* 20:29 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298390{{!}}English Wikibooks: update FlaggedRevs configuration (T428329)]], [[gerrit:1298328{{!}}English Wikiversity: Add new user group "autopatrolled" (T428269)]] (duration: 08m 58s)
* 20:25 mlitn@deploy1003: mlitn, vadymts1: Continuing with deployment
* 20:22 mlitn@deploy1003: mlitn, vadymts1: Backport for [[gerrit:1298390{{!}}English Wikibooks: update FlaggedRevs configuration (T428329)]], [[gerrit:1298328{{!}}English Wikiversity: Add new user group "autopatrolled" (T428269)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:20 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1298390{{!}}English Wikibooks: update FlaggedRevs configuration (T428329)]], [[gerrit:1298328{{!}}English Wikiversity: Add new user group "autopatrolled" (T428269)]]
* 20:03 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298879{{!}}SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437)]] (duration: 37m 43s)
* 19:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:31 kharlan@deploy1003: kharlan: Continuing with deployment
* 19:30 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:29 kharlan@deploy1003: kharlan: Backport for [[gerrit:1298879{{!}}SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:28 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:27 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:25 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1298879{{!}}SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437)]]
* 19:24 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab (duration: 01m 32s)
* 19:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:22 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab
* 19:20 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab (duration: 01m 40s)
* 19:19 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab
* 19:16 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:14 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2004
* 18:52 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2004
* 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2003
* 18:52 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2003
* 18:51 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:51 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2004 to codfw - jhancock@cumin2002"
* 18:51 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2004 to codfw - jhancock@cumin2002"
* 18:44 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:42 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:42 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2030 to codfw - jhancock@cumin2002"
* 18:42 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2030 to codfw - jhancock@cumin2002"
* 18:37 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:33 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2002
* 18:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2002
* 18:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2002 to codfw - jhancock@cumin2002"
* 18:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2002 to codfw - jhancock@cumin2002"
* 18:25 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:22 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2001
* 18:22 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2001
* 18:21 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating dse-k8s-wdqs2001 to codfw - jhancock@cumin2002"
* 18:21 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating dse-k8s-wdqs2001 to codfw - jhancock@cumin2002"
* 18:17 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:02 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T427286|T427286]] (duration: 00m 12s)
* 18:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T427286|T427286]]
* 17:37 jnuche@deploy1003: Installation of scap version "4.268.0" completed for 2 hosts
* 17:35 jnuche@deploy1003: Installing scap version "4.268.0" for 2 host(s)
* 17:21 claime: restarting varnish-frontend service on cp6012
* 17:21 claime: restarting varnish-frontend service on cp6011
* 17:21 claime: restarted varnish-frontend service on cp6009
* 17:13 taavi: bounce sirenbot to get it to re-join a channel
* 17:05 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 17:05 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:58 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 16:57 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 16:55 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 16:53 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 16:53 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 16:52 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 16:30 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 16:29 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 16:29 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 16:27 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 16:27 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 16:26 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 16:26 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 16:25 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 16:18 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 16:17 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 16:17 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 16:16 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 16:16 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:16 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:16 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 16:15 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 16:14 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 16:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 16:14 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 16:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 16:13 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 16:13 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 16:13 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 16:12 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 16:12 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 16:09 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 16:08 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 16:08 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 16:07 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 16:06 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 15:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2042: repool after upgrade
* 15:45 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db[2183-2184].codfw.wmnet
* 15:45 jynus@cumin2002: START - Cookbook sre.hosts.remove-downtime for db[2183-2184].codfw.wmnet
* 15:18 jynus: dbmaint on backup1-codfw@codfw ([[phab:T428467|T428467]])
* 15:12 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2042: repool after upgrade
* 15:12 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 15:09 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:09 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2042.codfw.wmnet with OS trixie
* 15:04 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:04 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:03 jynus@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2183-2184].codfw.wmnet with reason: Switchover db
* 15:03 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:01 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 15:00 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 14:59 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:55 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:55 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:54 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:50 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:50 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:50 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:49 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2042.codfw.wmnet with reason: host reimage
* 14:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2042.codfw.wmnet with reason: host reimage
* 14:32 Lucas_WMDE: UTC afternoon backport+config window done
* 14:32 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298709{{!}}Add translatable messages for WikiProject names (T427804)]], [[gerrit:1298710{{!}}Use translatable messages for WikiProject links (T427804)]], [[gerrit:1297644{{!}}WikiProject links - remove 'text' config (T427804)]] (duration: 31m 57s)
* 14:27 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:26 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2042.codfw.wmnet with OS trixie
* 14:26 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2042: Upgrading es2042.codfw.wmnet
* 14:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2042: Upgrading es2042.codfw.wmnet
* 14:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:24 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2043 to es4 codfw primary [[phab:T428386|T428386]]', diff saved to https://phabricator.wikimedia.org/P93926 and previous config saved to /var/cache/conftool/dbconfig/20260608-142423-marostegui.json
* 14:23 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1041: repool after maintenance
* 14:19 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Continuing with deployment
* 14:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Backport for [[gerrit:1298709{{!}}Add translatable messages for WikiProject names (T427804)]], [[gerrit:1298710{{!}}Use translatable messages for WikiProject links (T427804)]], [[gerrit:1297644{{!}}WikiProject links - remove 'text' config (T427804)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:11 cgoubert@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=liftwing-openapi-server.*
* 14:10 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp6013.*
* 14:10 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:05 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 14:05 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 13:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 13:52 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:50 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296550{{!}}hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608)]] (duration: 08m 31s)
* 13:48 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:46 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 13:43 cgoubert@dns1004: END - running authdns-update
* 13:43 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296550{{!}}hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:41 cgoubert@dns1004: START - running authdns-update
* 13:41 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296550{{!}}hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608)]]
* 13:39 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 13:38 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298758{{!}}feat(V2): toggle experiment features based on custom url override (T424646)]], [[gerrit:1298762{{!}}specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646)]], [[gerrit:1298764{{!}}fix: correctly read experiments param on Special:UserLogin]], [[gerrit:1298765{{!}}signup.js: use JS var instead of TestKitchen to show exp
* 13:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1041: repool after maintenance
* 13:38 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 13:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:37 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 13:36 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 13:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1041.eqiad.wmnet with OS trixie
* 13:34 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 13:34 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2041: repool after upgrade
* 13:34 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde: Continuing with deployment
* 13:34 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 13:32 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 13:30 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde: Backport for [[gerrit:1298758{{!}}feat(V2): toggle experiment features based on custom url override (T424646)]], [[gerrit:1298762{{!}}specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646)]], [[gerrit:1298764{{!}}fix: correctly read experiments param on Special:UserLogin]], [[gerrit:1298765{{!}}signup.js: use JS var instead of TestKitchen to show
* 13:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298758{{!}}feat(V2): toggle experiment features based on custom url override (T424646)]], [[gerrit:1298762{{!}}specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646)]], [[gerrit:1298764{{!}}fix: correctly read experiments param on Special:UserLogin]], [[gerrit:1298765{{!}}signup.js: use JS var instead of TestKitchen to show expe
* 13:21 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298418{{!}}NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206)]], [[gerrit:1298717{{!}}Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206)]], [[gerrit:1298734{{!}}Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206)]] (duration: 11m 06s)
* 13:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1041.eqiad.wmnet with reason: host reimage
* 13:17 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
* 13:12 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 13:12 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1298418{{!}}NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206)]], [[gerrit:1298717{{!}}Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206)]], [[gerrit:1298734{{!}}Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki
* 13:12 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 13:12 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1041.eqiad.wmnet with reason: host reimage
* 13:11 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 13:11 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 13:10 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298418{{!}}NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206)]], [[gerrit:1298717{{!}}Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206)]], [[gerrit:1298734{{!}}Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206)]]
* 12:57 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298767{{!}}Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608)]] (duration: 06m 20s)
* 12:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1041.eqiad.wmnet with OS trixie
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1041: Upgrading es1041.eqiad.wmnet
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:55 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1041: Upgrading es1041.eqiad.wmnet
* 12:55 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:54 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:53 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:53 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1298767{{!}}Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:51 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1298767{{!}}Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608)]]
* 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2041: repool after upgrade
* 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:46 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2063.codfw.wmnet with OS bullseye
* 12:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2062.codfw.wmnet with OS bullseye
* 12:27 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2041.codfw.wmnet with OS trixie
* 12:21 joal@deploy1003: Finished deploy [analytics/refinery@d67c584] (thin): Regular analytics weekly train THIN [analytics/refinery@d67c584f] (duration: 02m 00s)
* 12:19 joal@deploy1003: Started deploy [analytics/refinery@d67c584] (thin): Regular analytics weekly train THIN [analytics/refinery@d67c584f]
* 12:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
* 12:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 12:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 12:16 joal@deploy1003: Finished deploy [analytics/refinery@d67c584]: Regular analytics weekly train [analytics/refinery@d67c584f] (duration: 07m 52s)
* 12:15 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
* 12:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
* 12:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2041.codfw.wmnet with reason: host reimage
* 12:08 joal@deploy1003: Started deploy [analytics/refinery@d67c584]: Regular analytics weekly train [analytics/refinery@d67c584f]
* 12:08 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
* 12:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add eqiad e8 public vlans - ayounsi@cumin1003"
* 12:06 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add eqiad e8 public vlans - ayounsi@cumin1003"
* 12:03 joal@deploy1003: Finished deploy [analytics/refinery@d67c584] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d67c584f] (duration: 02m 00s)
* 12:03 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2041.codfw.wmnet with reason: host reimage
* 12:01 joal@deploy1003: Started deploy [analytics/refinery@d67c584] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d67c584f]
* 12:01 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:00 ayounsi@cumin1003: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 12:00 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:00 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 12:00 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2063
* 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2063
* 11:57 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2063
* 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2063.codfw.wmnet 52.16.192.10.in-addr.arpa 2.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:56 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2063.codfw.wmnet 52.16.192.10.in-addr.arpa 2.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:56 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:56 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2063 - mvernon@cumin2002"
* 11:56 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2063 - mvernon@cumin2002"
* 11:51 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:51 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2063
* 11:50 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2063.codfw.wmnet with OS bullseye
* 11:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2062
* 11:50 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2062
* 11:49 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2062
* 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2062.codfw.wmnet 123.0.192.10.in-addr.arpa 3.2.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:49 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2062.codfw.wmnet 123.0.192.10.in-addr.arpa 3.2.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2062 - mvernon@cumin2002"
* 11:49 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2062 - mvernon@cumin2002"
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2041.codfw.wmnet with OS trixie
* 11:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2041: Upgrading es2041.codfw.wmnet
* 11:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2041: Upgrading es2041.codfw.wmnet
* 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:44 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.major-upgrade (exit_code=97)
* 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: repool after maintenance
* 11:43 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:43 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2062
* 11:42 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2062.codfw.wmnet with OS bullseye
* 11:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298728{{!}}SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032)]] (duration: 17m 39s)
* 11:25 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:18 Raine: progressively switching shellbox to bookworm (start)
* 11:15 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 11:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 11:14 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1298728{{!}}SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:13 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 11:12 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 11:12 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1298728{{!}}SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032)]]
* 11:02 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2062
* 11:02 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2063
* 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1042: repool after maintenance
* 10:58 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:56 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1042.eqiad.wmnet with OS trixie
* 10:47 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298721{{!}}GuessedThumbnailInfo: Also allow showing webp originals (T428202)]] (duration: 16m 41s)
* 10:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1042.eqiad.wmnet with reason: host reimage
* 10:39 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 10:39 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 10:38 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 10:36 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 10:36 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 10:35 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2043: repool after upgrade
* 10:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2160.codfw.wmnet with reason: Reboot
* 10:34 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1298721{{!}}GuessedThumbnailInfo: Also allow showing webp originals (T428202)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:34 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1042.eqiad.wmnet with reason: host reimage
* 10:30 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1298721{{!}}GuessedThumbnailInfo: Also allow showing webp originals (T428202)]]
* 10:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1042.eqiad.wmnet with OS trixie
* 10:18 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:18 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:18 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:18 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1042: Upgrading es1042.eqiad.wmnet
* 10:14 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:14 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:14 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:14 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1042: Upgrading es1042.eqiad.wmnet
* 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:12 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2063
* 10:09 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2062
* 10:07 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:07 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:07 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:06 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:52 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 09:52 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 09:50 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 09:49 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 09:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2043: repool after upgrade
* 09:49 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2043.codfw.wmnet with OS trixie
* 09:44 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 09:44 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 09:42 ozge@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: sync
* 09:42 ozge@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: sync
* 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2043.codfw.wmnet with reason: host reimage
* 09:27 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 09:23 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2043.codfw.wmnet with reason: host reimage
* 09:17 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 09:15 ozge@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: sync
* 09:15 ozge@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: sync
* 09:07 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2043.codfw.wmnet with OS trixie
* 09:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2043: Upgrading es2043.codfw.wmnet
* 09:06 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2043: Upgrading es2043.codfw.wmnet
* 09:05 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1217.eqiad.wmnet with OS trixie
* 08:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1217.eqiad.wmnet with reason: host reimage
* 08:15 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database urwikisource ([[phab:T415977|T415977]])
* 08:14 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database urwikisource ([[phab:T415977|T415977]])
* 08:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1217.eqiad.wmnet with reason: host reimage
* 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2052: repool after upgrade
* 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1051: repool after maintenance
* 08:03 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis urwikisource in section s5
* 07:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1217.eqiad.wmnet with OS trixie
* 07:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1217.eqiad.wmnet with reason: reimage
* 07:53 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
* 07:52 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis urwikisource in section s5
* 07:50 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis urwikisource in section s5
* 07:50 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.sanitize-wiki (exit_code=97) Managing sanitization for wikis urwikisource in section s5
* 07:50 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
* 07:44 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297681{{!}}Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662)]] (duration: 32m 51s)
* 07:32 wmde-fisch@deploy1003: wmde-fisch, lilients: Continuing with deployment
* 07:29 wmde-fisch@deploy1003: wmde-fisch, lilients: Backport for [[gerrit:1297681{{!}}Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:21 elukey: upgrade sudo package on an-* hosts for [[phab:T428384|T428384]]
* 07:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2052: repool after upgrade
* 07:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1051: repool after maintenance
* 07:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:12 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database urwikisource ([[phab:T415977|T415977]])
* 07:12 elukey: upgrade exim4 packages on seaborgium for security upgrades
* 07:11 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1297681{{!}}Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662)]]
* 06:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1051.eqiad.wmnet with OS trixie
* 06:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1051.eqiad.wmnet with reason: host reimage
* 06:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1051.eqiad.wmnet with reason: host reimage
* 06:15 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database urwikisource ([[phab:T415977|T415977]])
* 05:58 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1051.eqiad.wmnet with OS trixie
* 05:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2052.codfw.wmnet with OS trixie
* 05:44 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1051: Upgrading es1051.eqiad.wmnet
* 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2052.codfw.wmnet with reason: host reimage
* 05:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2052.codfw.wmnet with reason: host reimage
* 05:35 marostegui@dns1004: END - running authdns-update
* 05:34 marostegui@dns1004: START - running authdns-update
* 05:33 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1051: Upgrading es1051.eqiad.wmnet
* 05:33 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:31 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1054 to es3 eqiad primary [[phab:T428050|T428050]]', diff saved to https://phabricator.wikimedia.org/P93895 and previous config saved to /var/cache/conftool/dbconfig/20260608-053156-marostegui.json
* 05:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2052.codfw.wmnet with OS trixie
* 05:18 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2052: Upgrading es2052.codfw.wmnet
* 05:18 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2052: Upgrading es2052.codfw.wmnet
* 05:18 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
== 2026-06-07 ==
* 16:32 elukey: `elukey@cumin1003:~$ sudo cumin 'cp6* and not cp6014* and not cp6010*' "varnish-frontend-restart" -b 1`
* 16:29 elukey: restart varnish-frontend on cp6014
== 2026-06-06 ==
* 09:07 ammarpad@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=hewiki --logwiki=metawiki W.Mechelke Tungsten_Mechelke # [[phab:T428182|T428182]]
== 2026-06-05 ==
* 22:16 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 21:01 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=10 --verbose` (after stopping the other commons scan)
* 20:56 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=30 --verbose` (after stopping the other commons scan)
* 20:20 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290093{{!}}Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)]] (duration: 10m 02s)
* 20:16 krinkle@deploy1003: krinkle: Continuing with deployment
* 20:12 krinkle@deploy1003: krinkle: Backport for [[gerrit:1290093{{!}}Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1290093{{!}}Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)]]
* 16:45 jgreen@dns1004: END - running authdns-update
* 16:44 jgreen@dns1004: START - running authdns-update
* 16:17 dzahn@dns1005: END - running authdns-update
* 16:17 mutante: DNS - adding new project language "mag" - Magahi - a language spoken in India and Nepal by about 12 million native speakers ([[phab:T428266|T428266]])
* 16:16 dzahn@dns1005: START - running authdns-update
* 14:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 12:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:30 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:30 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2202.codfw.wmnet with reason: Reboot
* 12:28 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:28 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:08 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:07 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:07 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:06 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:29 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:28 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:55 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:54 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:31 ozge@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1054: repool after upgrade
* 08:08 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 08:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 08:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 08:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:39 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1054: repool after upgrade
* 07:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:17 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:17 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:16 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:07 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 06:01 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1054.eqiad.wmnet with OS trixie
* 05:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1054.eqiad.wmnet with reason: host reimage
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1054.eqiad.wmnet with reason: host reimage
* 05:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1054.eqiad.wmnet with OS trixie
* 05:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1054: Upgrading es1054.eqiad.wmnet
* 05:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1054: Upgrading es1054.eqiad.wmnet
* 05:20 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 01:55 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1010.eqiad.wmnet with OS trixie
* 01:39 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
* 01:32 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
* 01:16 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS trixie
* 00:56 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1007.eqiad.wmnet with OS trixie
* 00:40 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
* 00:33 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
* 00:17 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1007.eqiad.wmnet with OS trixie
* 00:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297268{{!}}Redirect unknown wikinews languages to portal (T427126)]] (duration: 07m 02s)
== 2026-06-04 ==
* 23:57 ladsgroup@deploy1003: ladsgroup, pppery: Continuing with deployment
* 23:57 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1006.eqiad.wmnet with OS trixie
* 23:57 ladsgroup@deploy1003: ladsgroup, pppery: Backport for [[gerrit:1297268{{!}}Redirect unknown wikinews languages to portal (T427126)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:55 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1297268{{!}}Redirect unknown wikinews languages to portal (T427126)]]
* 23:40 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
* 23:36 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
* 23:20 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS trixie
* 21:28 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host releases1003.eqiad.wmnet with OS trixie
* 21:04 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: host reimage
* 20:58 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on releases1003.eqiad.wmnet with reason: host reimage
* 20:50 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.*
* 20:42 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host releases1003.eqiad.wmnet with OS trixie
* 20:27 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1100.eqiad.wmnet,service=(cdn{{!}}ats-be)
* 20:26 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6013.drmrs.wmnet,service=(cdn{{!}}ats-be)
* 20:20 brett@dns1006: END - running authdns-update
* 20:19 brett@dns1006: START - running authdns-update
* 20:18 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5030.eqsin.wmnet with OS trixie
* 20:10 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296015{{!}}Deploy PRV to 6 wikis (T427851)]] (duration: 07m 39s)
* 20:08 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist group2.dblist extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose`
* 20:06 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:04 arlolra@deploy1003: arlolra: Backport for [[gerrit:1296015{{!}}Deploy PRV to 6 wikis (T427851)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:02 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1296015{{!}}Deploy PRV to 6 wikis (T427851)]]
* 19:49 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:43 cmooney@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:15 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5030
* 19:15 cmooney@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5030
* 19:14 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cp5030
* 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5030.eqsin.wmnet 27.0.132.10.in-addr.arpa 7.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:14 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache cp5030.eqsin.wmnet 27.0.132.10.in-addr.arpa 7.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5030 - cmooney@cumin1003"
* 19:13 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5030 - cmooney@cumin1003"
* 19:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:08 cmooney@cumin1003: START - Cookbook sre.hosts.move-vlan for host cp5030
* 19:08 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host cp5030.eqsin.wmnet with OS trixie
* 18:51 cmooney@dns2005: END - running authdns-update
* 18:50 cmooney@dns2005: START - running authdns-update
* 18:43 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:42 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for eqsin cr links - cmooney@cumin1003"
* 18:40 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for eqsin cr links - cmooney@cumin1003"
* 18:37 sukhe: sukhe@cp6013:~$ sudo traffic_server -C clear_cache
* 18:36 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:08 dancy@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.5 refs [[phab:T423914|T423914]]
* 17:17 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297751{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297752{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] (duration: 06m 40s)
* 17:13 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:13 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297751{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297752{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297751{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297752{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]]
* 16:55 topranks: shift traffic off cr1-esams et-1/0/1 link to asw1-by27-esams [[phab:T427056|T427056]]
* 16:45 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297741{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297742{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] (duration: 13m 58s)
* 16:41 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 16:33 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297741{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297742{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:31 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297741{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297742{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]]
* 16:17 ozge@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:03 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297740{{!}}hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)]] (duration: 10m 21s)
* 16:03 elukey: uploaded spicerack_12.7.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 15:59 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:55 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297740{{!}}hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:53 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297740{{!}}hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)]]
* 15:44 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5030.*
* 15:41 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2007.codfw.wmnet with OS trixie
* 15:39 ladsgroup@cumin1003: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
* 15:28 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 15:24 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297730{{!}}ptwiki: Disable Article Guidance experiment (T426871)]] (duration: 07m 26s)
* 15:24 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
* 15:20 sbisson@deploy1003: sbisson: Continuing with deployment
* 15:19 sbisson@deploy1003: sbisson: Backport for [[gerrit:1297730{{!}}ptwiki: Disable Article Guidance experiment (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:19 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
* 15:17 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1297730{{!}}ptwiki: Disable Article Guidance experiment (T426871)]]
* 15:13 ladsgroup@cumin1003: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
* 15:06 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297724{{!}}Revert "Start reading from new file tables on commons"]] (duration: 07m 00s)
* 15:05 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 15:02 zabe@deploy1003: zabe: Continuing with deployment
* 15:01 zabe@deploy1003: zabe: Backport for [[gerrit:1297724{{!}}Revert "Start reading from new file tables on commons"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1297724{{!}}Revert "Start reading from new file tables on commons"]]
* 14:57 zabe@deploy1003: Finished scap sync-world: [[phab:T416548|T416548]] (duration: 05m 10s)
* 14:56 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-main2007.codfw.wmnet with OS trixie
* 14:52 zabe@deploy1003: Started scap sync-world: [[phab:T416548|T416548]]
* 14:50 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:49 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:43 zabe@deploy1003: sync-world aborted: Backport for [[gerrit:1270513{{!}}Start reading from new file tables on commons (T416548)]] (duration: 03m 58s)
* 14:43 zabe@deploy1003: zabe: Continuing with deployment
* 14:41 zabe@deploy1003: zabe: Backport for [[gerrit:1270513{{!}}Start reading from new file tables on commons (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f1-codfw
* 14:40 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-f1-codfw
* 14:39 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1270513{{!}}Start reading from new file tables on commons (T416548)]]
* 14:36 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297711{{!}}hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)]] (duration: 08m 20s)
* 14:32 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:30 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297711{{!}}hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1057: repool after upgrade
* 14:28 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297711{{!}}hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)]]
* 14:20 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:16 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 14:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 14:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297704{{!}}Use the globalblock-local-status right over globalblock-whitelist (T277942)]], [[gerrit:1296620{{!}}core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)]] (duration: 06m 46s)
* 14:10 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:08 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:08 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297704{{!}}Use the globalblock-local-status right over globalblock-whitelist (T277942)]], [[gerrit:1296620{{!}}core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 14:06 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297704{{!}}Use the globalblock-local-status right over globalblock-whitelist (T277942)]], [[gerrit:1296620{{!}}core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)]]
* 14:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:06 tappof: bump space for prometheus k8s-aux in eqiad
* 14:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 14:05 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:04 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
* 13:56 _joe_: transferred requestctl api tokens for all ops to the db ([[phab:T428119|T428119]])
* 13:56 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2050 to es3 codfw primary [[phab:T428050|T428050]]', diff saved to https://phabricator.wikimedia.org/P93878 and previous config saved to /var/cache/conftool/dbconfig/20260604-135631-marostegui.json
* 13:56 Dreamy_Jazz: Afternoon UTC backport window done
* 13:54 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297700{{!}}Revert "hCaptcha: Provide always challenge sitekey for account creation"]] (duration: 13m 38s)
* 13:51 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:50 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 13:47 sukhe: sukhe@cp6011:~$ sudo -i varnish-frontend-restart
* 13:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1057: repool after upgrade
* 13:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:43 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297700{{!}}Revert "hCaptcha: Provide always challenge sitekey for account creation"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1057.eqiad.wmnet with OS trixie
* 13:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297700{{!}}Revert "hCaptcha: Provide always challenge sitekey for account creation"]]
* 13:38 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297692{{!}}hCaptcha: Provide always challenge sitekey for account creation (T421041)]] (duration: 05m 27s)
* 13:38 dreamyjazz@deploy1003: dreamyjazz: Rolling back deployment
* 13:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: down
* 13:35 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297692{{!}}hCaptcha: Provide always challenge sitekey for account creation (T421041)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297692{{!}}hCaptcha: Provide always challenge sitekey for account creation (T421041)]]
* 13:31 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295978{{!}}Update config for WikiProjects linking prototype (T427804)]] (duration: 17m 13s)
* 13:26 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Continuing with deployment
* 13:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1057.eqiad.wmnet with reason: host reimage
* 13:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1057.eqiad.wmnet with reason: host reimage
* 13:16 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Backport for [[gerrit:1295978{{!}}Update config for WikiProjects linking prototype (T427804)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1295978{{!}}Update config for WikiProjects linking prototype (T427804)]]
* 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1220: Migration of db1220.eqiad.wmnet completed
* 13:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: down
* 13:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1224', diff saved to https://phabricator.wikimedia.org/P93875 and previous config saved to /var/cache/conftool/dbconfig/20260604-131219-marostegui.json
* 13:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1057.eqiad.wmnet with OS trixie
* 13:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1057: Upgrading es1057.eqiad.wmnet
* 12:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1057: Upgrading es1057.eqiad.wmnet
* 12:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:56 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296557{{!}}wmf-config: Skip CAPTCHA for action=mcrundo (T427612)]] (duration: 08m 30s)
* 12:52 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Continuing with deployment
* 12:50 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Backport for [[gerrit:1296557{{!}}wmf-config: Skip CAPTCHA for action=mcrundo (T427612)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2050: repool after upgrade
* 12:48 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296557{{!}}wmf-config: Skip CAPTCHA for action=mcrundo (T427612)]]
* 12:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 12:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 12:28 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1220: Migration of db1220.eqiad.wmnet completed
* 12:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1220.eqiad.wmnet with OS trixie
* 12:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2050: repool after upgrade
* 12:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1220.eqiad.wmnet with reason: host reimage
* 11:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1220.eqiad.wmnet with reason: host reimage
* 11:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1220.eqiad.wmnet with OS trixie
* 11:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2050.codfw.wmnet with OS trixie
* 11:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1220: Upgrading db1220.eqiad.wmnet
* 11:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1220: Upgrading db1220.eqiad.wmnet
* 11:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1179: Migration of db1179.eqiad.wmnet completed
* 11:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2050.codfw.wmnet with reason: host reimage
* 11:16 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2050.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2050.codfw.wmnet with OS trixie
* 11:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2050: Upgrading es2050.codfw.wmnet
* 10:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2050: Upgrading es2050.codfw.wmnet
* 10:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2057: repool after upgrade
* 10:58 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:55 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 10:46 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1179: Migration of db1179.eqiad.wmnet completed
* 10:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1179.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1179.eqiad.wmnet with reason: host reimage
* 10:16 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
* 10:15 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
* 10:15 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
* 10:15 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/kartotherian: apply
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1179.eqiad.wmnet with reason: host reimage
* 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2057: repool after upgrade
* 10:13 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2057.codfw.wmnet with OS trixie
* 09:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1179.eqiad.wmnet with OS trixie
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1179: Upgrading db1179.eqiad.wmnet
* 09:58 jynus: redoing m2 backups after grant change [[phab:T411111|T411111]]
* 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1179: Upgrading db1179.eqiad.wmnet
* 09:56 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2057.codfw.wmnet with reason: host reimage
* 09:53 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2057.codfw.wmnet with reason: host reimage
* 09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Migration of db1224.eqiad.wmnet completed
* 09:38 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:36 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:35 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:33 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2057.codfw.wmnet with OS trixie
* 09:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2057: Upgrading es2057.codfw.wmnet
* 09:32 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2057: Upgrading es2057.codfw.wmnet
* 09:31 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:26 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=30 --sleep=60 --verbose`
* 09:25 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "group0.dblist + group1.dblist - mediamoderation-continuous-scan.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose`
* 08:54 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Introduce pluggable authentication - oblivian@cumin1003"
* 08:54 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Introduce pluggable authentication - oblivian@cumin1003
* 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Migration of db1224.eqiad.wmnet completed
* 08:53 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Introduce pluggable authentication - oblivian@cumin1003
* 08:53 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Introduce pluggable authentication - oblivian@cumin1003"
* 08:29 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:29 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 08:24 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:24 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 08:21 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1224.eqiad.wmnet with OS trixie
* 08:21 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 08:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
* 08:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2249.codfw.wmnet with reason: upgrade
* 08:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
* 07:53 marostegui: Install mariadb 10.11.17 on db2249 [[phab:T427345|T427345]]
* 07:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1224.eqiad.wmnet with OS trixie
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1224: Upgrading db1224.eqiad.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1224: Upgrading db1224.eqiad.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1255: Migration of db1255.eqiad.wmnet completed
* 07:34 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297536{{!}}hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200{{!}}hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173{{!}}hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]] (duration: 08m 56s)
* 07:29 kharlan@deploy1003: kharlan, harroyo-wmf: Continuing with deployment
* 07:27 kharlan@deploy1003: kharlan, harroyo-wmf: Backport for [[gerrit:1297536{{!}}hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200{{!}}hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173{{!}}hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwd
* 07:25 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297536{{!}}hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200{{!}}hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173{{!}}hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]]
* 07:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2191: Migration of db2191.codfw.wmnet completed
* 07:12 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297550{{!}}Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]] (duration: 06m 45s)
* 07:08 kharlan@deploy1003: kharlan: Continuing with deployment
* 07:08 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297550{{!}}Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:06 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297550{{!}}Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]]
* 07:04 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297260{{!}}EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)]] (duration: 399m 30s)
* 07:03 otto@deploy1003: otto: Rolling back deployment
* 06:53 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1255: Migration of db1255.eqiad.wmnet completed
* 06:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1255.eqiad.wmnet with OS trixie
* 06:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2191: Migration of db2191.codfw.wmnet completed
* 06:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1255.eqiad.wmnet with reason: host reimage
* 06:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2191.codfw.wmnet with OS trixie
* 06:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1255.eqiad.wmnet with reason: host reimage
* 06:16 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1255.eqiad.wmnet with OS trixie
* 06:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2191.codfw.wmnet with reason: host reimage
* 06:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1255: Upgrading db1255.eqiad.wmnet
* 06:12 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1255: Upgrading db1255.eqiad.wmnet
* 06:12 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2191.codfw.wmnet with reason: host reimage
* 06:04 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db1255 [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93836 and previous config saved to /var/cache/conftool/dbconfig/20260604-060428-cwilliams.json
* 06:03 cwilliams@dns1004: END - running authdns-update
* 06:02 cwilliams@dns1004: START - running authdns-update
* 05:54 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db1258 to x3 primary and set section read-write [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93835 and previous config saved to /var/cache/conftool/dbconfig/20260604-055429-cwilliams.json
* 05:53 cwilliams@cumin1003: dbctl commit (dc=all): 'Set x3 eqiad as read-only for maintenance - [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93834 and previous config saved to /var/cache/conftool/dbconfig/20260604-055346-cwilliams.json
* 05:53 cezmunsta: Starting x3 eqiad failover from db1255 to db1258 - [[phab:T427895|T427895]]
* 05:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2191.codfw.wmnet with OS trixie
* 05:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2191: Upgrading db2191.codfw.wmnet
* 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2191: Upgrading db2191.codfw.wmnet
* 05:50 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db1258 with weight 0 [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93833 and previous config saved to /var/cache/conftool/dbconfig/20260604-055021-cwilliams.json
* 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:50 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T427895|T427895]]
* 05:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 05:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2191 [[phab:T428120|T428120]]', diff saved to https://phabricator.wikimedia.org/P93832 and previous config saved to /var/cache/conftool/dbconfig/20260604-054614-marostegui.json
* 05:45 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2215 to x1 primary [[phab:T428120|T428120]]', diff saved to https://phabricator.wikimedia.org/P93831 and previous config saved to /var/cache/conftool/dbconfig/20260604-054528-marostegui.json
* 05:44 marostegui: Starting x1 codfw failover from db2191 to db2215 - [[phab:T428120|T428120]]
* 05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 16 hosts with reason: Primary switchover x1 [[phab:T428120|T428120]]
* 05:27 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2215 with weight 0 [[phab:T428120|T428120]]', diff saved to https://phabricator.wikimedia.org/P93830 and previous config saved to /var/cache/conftool/dbconfig/20260604-052722-marostegui.json
* 05:19 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 03:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93829 and previous config saved to /var/cache/conftool/dbconfig/20260604-034546-fceratto.json
* 03:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P93828 and previous config saved to /var/cache/conftool/dbconfig/20260604-033538-fceratto.json
* 03:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P93827 and previous config saved to /var/cache/conftool/dbconfig/20260604-032531-fceratto.json
* 03:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93826 and previous config saved to /var/cache/conftool/dbconfig/20260604-031523-fceratto.json
* 03:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1263 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93825 and previous config saved to /var/cache/conftool/dbconfig/20260604-030710-fceratto.json
* 03:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1263.eqiad.wmnet with reason: Maintenance
* 03:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93824 and previous config saved to /var/cache/conftool/dbconfig/20260604-030642-fceratto.json
* 02:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P93823 and previous config saved to /var/cache/conftool/dbconfig/20260604-025634-fceratto.json
* 02:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P93822 and previous config saved to /var/cache/conftool/dbconfig/20260604-024627-fceratto.json
* 02:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93821 and previous config saved to /var/cache/conftool/dbconfig/20260604-023619-fceratto.json
* 02:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1262 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93820 and previous config saved to /var/cache/conftool/dbconfig/20260604-022809-fceratto.json
* 02:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1262.eqiad.wmnet with reason: Maintenance
* 02:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93819 and previous config saved to /var/cache/conftool/dbconfig/20260604-022742-fceratto.json
* 02:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P93818 and previous config saved to /var/cache/conftool/dbconfig/20260604-021734-fceratto.json
* 02:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P93817 and previous config saved to /var/cache/conftool/dbconfig/20260604-020726-fceratto.json
* 01:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93816 and previous config saved to /var/cache/conftool/dbconfig/20260604-015718-fceratto.json
* 01:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1261 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93815 and previous config saved to /var/cache/conftool/dbconfig/20260604-014909-fceratto.json
* 01:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1261.eqiad.wmnet with reason: Maintenance
* 01:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93814 and previous config saved to /var/cache/conftool/dbconfig/20260604-014841-fceratto.json
* 01:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P93813 and previous config saved to /var/cache/conftool/dbconfig/20260604-013833-fceratto.json
* 01:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P93812 and previous config saved to /var/cache/conftool/dbconfig/20260604-012826-fceratto.json
* 01:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93811 and previous config saved to /var/cache/conftool/dbconfig/20260604-011818-fceratto.json
* 01:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1260 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93810 and previous config saved to /var/cache/conftool/dbconfig/20260604-011005-fceratto.json
* 01:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1260.eqiad.wmnet with reason: Maintenance
* 01:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93809 and previous config saved to /var/cache/conftool/dbconfig/20260604-010937-fceratto.json
* 00:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P93808 and previous config saved to /var/cache/conftool/dbconfig/20260604-005929-fceratto.json
* 00:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P93807 and previous config saved to /var/cache/conftool/dbconfig/20260604-004922-fceratto.json
* 00:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93806 and previous config saved to /var/cache/conftool/dbconfig/20260604-003914-fceratto.json
* 00:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1252 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93805 and previous config saved to /var/cache/conftool/dbconfig/20260604-002851-fceratto.json
* 00:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1252.eqiad.wmnet with reason: Maintenance
* 00:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93804 and previous config saved to /var/cache/conftool/dbconfig/20260604-002821-fceratto.json
* 00:26 otto@deploy1003: otto: Backport for [[gerrit:1297260{{!}}EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:24 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1297260{{!}}EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)]]
* 00:18 Amir1: mwscript-k8s --follow --dblist=all -- extensions/timeline/maintenance/DeleteOldTimelineFiles.php --date 20210101000000
* 00:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P93803 and previous config saved to /var/cache/conftool/dbconfig/20260604-001813-fceratto.json
* 00:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P93802 and previous config saved to /var/cache/conftool/dbconfig/20260604-000805-fceratto.json
== 2026-06-03 ==
* 23:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93801 and previous config saved to /var/cache/conftool/dbconfig/20260603-235758-fceratto.json
* 23:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93800 and previous config saved to /var/cache/conftool/dbconfig/20260603-234935-fceratto.json
* 23:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
* 23:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93799 and previous config saved to /var/cache/conftool/dbconfig/20260603-234907-fceratto.json
* 23:42 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296561{{!}}Add a maintenance script to delete old files]], [[gerrit:1296560{{!}}Add a maintenance script to delete old files]] (duration: 07m 09s)
* 23:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P93798 and previous config saved to /var/cache/conftool/dbconfig/20260603-233859-fceratto.json
* 23:37 ladsgroup@deploy1003: ladsgroup, reedy: Continuing with deployment
* 23:36 ladsgroup@deploy1003: ladsgroup, reedy: Backport for [[gerrit:1296561{{!}}Add a maintenance script to delete old files]], [[gerrit:1296560{{!}}Add a maintenance script to delete old files]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:34 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1296561{{!}}Add a maintenance script to delete old files]], [[gerrit:1296560{{!}}Add a maintenance script to delete old files]]
* 23:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P93797 and previous config saved to /var/cache/conftool/dbconfig/20260603-232852-fceratto.json
* 23:22 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 23:22 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 23:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93796 and previous config saved to /var/cache/conftool/dbconfig/20260603-231844-fceratto.json
* 23:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93795 and previous config saved to /var/cache/conftool/dbconfig/20260603-231031-fceratto.json
* 23:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
* 23:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93794 and previous config saved to /var/cache/conftool/dbconfig/20260603-231001-fceratto.json
* 22:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P93793 and previous config saved to /var/cache/conftool/dbconfig/20260603-225953-fceratto.json
* 22:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P93792 and previous config saved to /var/cache/conftool/dbconfig/20260603-224945-fceratto.json
* 22:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93791 and previous config saved to /var/cache/conftool/dbconfig/20260603-223937-fceratto.json
* 22:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1244 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93790 and previous config saved to /var/cache/conftool/dbconfig/20260603-223116-fceratto.json
* 22:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1244.eqiad.wmnet with reason: Maintenance
* 22:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93789 and previous config saved to /var/cache/conftool/dbconfig/20260603-223048-fceratto.json
* 22:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P93788 and previous config saved to /var/cache/conftool/dbconfig/20260603-222041-fceratto.json
* 22:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P93787 and previous config saved to /var/cache/conftool/dbconfig/20260603-221034-fceratto.json
* 22:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93786 and previous config saved to /var/cache/conftool/dbconfig/20260603-220026-fceratto.json
* 21:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1243 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93785 and previous config saved to /var/cache/conftool/dbconfig/20260603-215110-fceratto.json
* 21:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93784 and previous config saved to /var/cache/conftool/dbconfig/20260603-215053-fceratto.json
* 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P93783 and previous config saved to /var/cache/conftool/dbconfig/20260603-214046-fceratto.json
* 21:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P93782 and previous config saved to /var/cache/conftool/dbconfig/20260603-213038-fceratto.json
* 21:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93781 and previous config saved to /var/cache/conftool/dbconfig/20260603-212030-fceratto.json
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1242 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93779 and previous config saved to /var/cache/conftool/dbconfig/20260603-211206-fceratto.json
* 21:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
* 21:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93778 and previous config saved to /var/cache/conftool/dbconfig/20260603-211138-fceratto.json
* 21:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P93774 and previous config saved to /var/cache/conftool/dbconfig/20260603-210130-fceratto.json
* 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P93773 and previous config saved to /var/cache/conftool/dbconfig/20260603-205122-fceratto.json
* 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93772 and previous config saved to /var/cache/conftool/dbconfig/20260603-204115-fceratto.json
* 20:33 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297228{{!}}Attribution research don't use testKitchen compatibility layer (T417050)]] (duration: 06m 41s)
* 20:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1241 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93771 and previous config saved to /var/cache/conftool/dbconfig/20260603-203254-fceratto.json
* 20:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
* 20:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93770 and previous config saved to /var/cache/conftool/dbconfig/20260603-203227-fceratto.json
* 20:29 cjming@deploy1003: cjming: Continuing with deployment
* 20:29 cjming@deploy1003: cjming: Backport for [[gerrit:1297228{{!}}Attribution research don't use testKitchen compatibility layer (T417050)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:26 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1297228{{!}}Attribution research don't use testKitchen compatibility layer (T417050)]]
* 20:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P93769 and previous config saved to /var/cache/conftool/dbconfig/20260603-202219-fceratto.json
* 20:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P93766 and previous config saved to /var/cache/conftool/dbconfig/20260603-201211-fceratto.json
* 20:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93765 and previous config saved to /var/cache/conftool/dbconfig/20260603-200203-fceratto.json
* 19:59 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
* 19:59 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
* 19:59 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 19:59 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 19:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93764 and previous config saved to /var/cache/conftool/dbconfig/20260603-195341-fceratto.json
* 19:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
* 19:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93763 and previous config saved to /var/cache/conftool/dbconfig/20260603-195313-fceratto.json
* 19:47 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 19:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P93762 and previous config saved to /var/cache/conftool/dbconfig/20260603-194306-fceratto.json
* 19:39 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 19:37 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 19:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P93761 and previous config saved to /var/cache/conftool/dbconfig/20260603-193258-fceratto.json
* 19:26 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
* 19:25 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
* 19:25 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 19:25 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 19:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93760 and previous config saved to /var/cache/conftool/dbconfig/20260603-192250-fceratto.json
* 19:22 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 19:22 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 19:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93759 and previous config saved to /var/cache/conftool/dbconfig/20260603-191437-fceratto.json
* 19:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1024-1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 19:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
* 19:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93758 and previous config saved to /var/cache/conftool/dbconfig/20260603-191348-fceratto.json
* 19:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P93757 and previous config saved to /var/cache/conftool/dbconfig/20260603-190340-fceratto.json
* 18:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P93756 and previous config saved to /var/cache/conftool/dbconfig/20260603-185331-fceratto.json
* 18:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93755 and previous config saved to /var/cache/conftool/dbconfig/20260603-184324-fceratto.json
* 18:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1199 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93754 and previous config saved to /var/cache/conftool/dbconfig/20260603-183455-fceratto.json
* 18:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
* 18:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93753 and previous config saved to /var/cache/conftool/dbconfig/20260603-183427-fceratto.json
* 18:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P93752 and previous config saved to /var/cache/conftool/dbconfig/20260603-182420-fceratto.json
* 18:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P93751 and previous config saved to /var/cache/conftool/dbconfig/20260603-181412-fceratto.json
* 18:10 dancy@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.5 refs [[phab:T423914|T423914]]
* 18:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93750 and previous config saved to /var/cache/conftool/dbconfig/20260603-180404-fceratto.json
* 17:57 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 17:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93749 and previous config saved to /var/cache/conftool/dbconfig/20260603-175544-fceratto.json
* 17:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
* 17:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93748 and previous config saved to /var/cache/conftool/dbconfig/20260603-175342-fceratto.json
* 17:52 hashar: contint1003: sudo puppet agent --disable "Prevent Jenkins from coming back"
* 17:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P93747 and previous config saved to /var/cache/conftool/dbconfig/20260603-174334-fceratto.json
* 17:38 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:37 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:37 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:33 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P93746 and previous config saved to /var/cache/conftool/dbconfig/20260603-173327-fceratto.json
* 17:33 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:32 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 17:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93745 and previous config saved to /var/cache/conftool/dbconfig/20260603-172319-fceratto.json
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: Stopping before sync operations
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: Started scap sync-world: No-deploy scap run to verify scap config change
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1253 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93744 and previous config saved to /var/cache/conftool/dbconfig/20260603-171521-fceratto.json
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1253.eqiad.wmnet with reason: Maintenance
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93743 and previous config saved to /var/cache/conftool/dbconfig/20260603-171452-fceratto.json
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:12 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:10 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:10 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:10 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:09 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2012.wikimedia.org with OS trixie
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P93742 and previous config saved to /var/cache/conftool/dbconfig/20260603-170444-fceratto.json
* 17:04 swfrench@deploy1003: Stopping before sync operations
* 17:03 swfrench@deploy1003: Started scap sync-world: No-deploy scap run to verify clean state before config change
* 16:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P93741 and previous config saved to /var/cache/conftool/dbconfig/20260603-165436-fceratto.json
* 16:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:53 hashar: Restarting CI Jenkins one last time # [[phab:T418521|T418521]]
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:44 btullis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295922{{!}}Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)]] (duration: 07m 16s)
* 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93740 and previous config saved to /var/cache/conftool/dbconfig/20260603-164428-fceratto.json
* 16:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:42 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:41 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:40 btullis@deploy1003: btullis: Continuing with deployment
* 16:39 btullis@deploy1003: btullis: Backport for [[gerrit:1295922{{!}}Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93739 and previous config saved to /var/cache/conftool/dbconfig/20260603-163726-fceratto.json
* 16:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance
* 16:37 btullis@deploy1003: Started scap sync-world: Backport for [[gerrit:1295922{{!}}Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)]]
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93738 and previous config saved to /var/cache/conftool/dbconfig/20260603-163658-fceratto.json
* 16:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P93737 and previous config saved to /var/cache/conftool/dbconfig/20260603-162650-fceratto.json
* 16:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P93736 and previous config saved to /var/cache/conftool/dbconfig/20260603-161643-fceratto.json
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93735 and previous config saved to /var/cache/conftool/dbconfig/20260603-160635-fceratto.json
* 16:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93734 and previous config saved to /var/cache/conftool/dbconfig/20260603-155928-fceratto.json
* 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93733 and previous config saved to /var/cache/conftool/dbconfig/20260603-155859-fceratto.json
* 15:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P93732 and previous config saved to /var/cache/conftool/dbconfig/20260603-154852-fceratto.json
* 15:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:46 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2012.wikimedia.org with OS trixie
* 15:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:40 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
* 15:40 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
* 15:40 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 15:39 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 15:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P93731 and previous config saved to /var/cache/conftool/dbconfig/20260603-153844-fceratto.json
* 15:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93729 and previous config saved to /var/cache/conftool/dbconfig/20260603-152836-fceratto.json
* 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
* 15:25 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
* 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
* 15:25 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
* 15:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:23 mutante: disabling jenkins on CI servers for maintenance
* 15:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
* 15:23 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1202 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93728 and previous config saved to /var/cache/conftool/dbconfig/20260603-152129-fceratto.json
* 15:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 15:21 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2012 to codfw - jhancock@cumin2002"
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93727 and previous config saved to /var/cache/conftool/dbconfig/20260603-152102-fceratto.json
* 15:20 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2012 to codfw - jhancock@cumin2002"
* 15:18 brouberol@dns1004: END - running authdns-update
* 15:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:16 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:16 brouberol@dns1004: START - running authdns-update
* 15:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P93726 and previous config saved to /var/cache/conftool/dbconfig/20260603-151055-fceratto.json
* 15:01 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P93725 and previous config saved to /var/cache/conftool/dbconfig/20260603-150047-fceratto.json
* 14:57 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 cmooney@cumin1003: END (FAIL) - Cookbook sre.netbox.update-extras (exit_code=1) rolling restart_daemons on A:netbox
* 14:51 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93723 and previous config saved to /var/cache/conftool/dbconfig/20260603-145039-fceratto.json
* 14:48 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297137{{!}}Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"]] (duration: 06m 46s)
* 14:47 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 14:46 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:46 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:43 mlitn@deploy1003: mlitn: Continuing with deployment
* 14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93722 and previous config saved to /var/cache/conftool/dbconfig/20260603-144334-fceratto.json
* 14:43 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 14:43 mlitn@deploy1003: mlitn: Backport for [[gerrit:1297137{{!}}Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93721 and previous config saved to /var/cache/conftool/dbconfig/20260603-144306-fceratto.json
* 14:41 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:41 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:41 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1297137{{!}}Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"]]
* 14:39 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:39 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:39 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:39 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:38 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 14:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 14:34 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297130{{!}}editor: make redesigned anon warning the default experience (T424595)]] (duration: 10m 45s)
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P93719 and previous config saved to /var/cache/conftool/dbconfig/20260603-143259-fceratto.json
* 14:30 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:28 sgimeno@deploy1003: sgimeno: Continuing with deployment
* 14:25 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1297130{{!}}editor: make redesigned anon warning the default experience (T424595)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:24 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:24 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:23 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1297130{{!}}editor: make redesigned anon warning the default experience (T424595)]]
* 14:23 gengh@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P93717 and previous config saved to /var/cache/conftool/dbconfig/20260603-142251-fceratto.json
* 14:22 gengh@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:22 gengh@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:21 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:21 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:21 gengh@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:20 gengh@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:20 gengh@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:20 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:20 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:19 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:19 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:16 vriley@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:16 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:16 gengh@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:13 gengh@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:12 gengh@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93716 and previous config saved to /var/cache/conftool/dbconfig/20260603-141242-fceratto.json
* 14:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:11 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:11 gengh@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc2055.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:10 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mc2055.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:10 gengh@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:09 gengh@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:08 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:07 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:05 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296631{{!}}translate: adding separate read/write endpoints (T425377)]] (duration: 13m 06s)
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1191 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93715 and previous config saved to /var/cache/conftool/dbconfig/20260603-140537-fceratto.json
* 14:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93714 and previous config saved to /var/cache/conftool/dbconfig/20260603-140507-fceratto.json
* 14:01 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:58 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 13:56 dcausse@deploy1003: atsuko, dcausse: Rolling back deployment
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T426633|T426633]])', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-133440-fceratto.json
* 13:29 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2186: Migration of db2186.codfw.wmnet completed
* 13:28 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295910{{!}}hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)]] (duration: 07m 36s)
* 13:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T426633|T426633]])', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-132638-fceratto.json
* 13:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 13:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93710 and previous config saved to /var/cache/conftool/dbconfig/20260603-132605-fceratto.json
* 13:25 sukhe: sudo cumin 'A:lvs or A:liberica' 'disable-puppet "merging CR 1282764"'
* 13:23 kharlan@deploy1003: kharlan: Continuing with deployment
* 13:22 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295910{{!}}hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295910{{!}}hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)]]
* 13:18 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296649{{!}}hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)]] (duration: 07m 46s)
* 13:16 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-131556-fceratto.json
* 13:15 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:13 kharlan@deploy1003: dbrant, kharlan: Continuing with deployment
* 13:12 kharlan@deploy1003: dbrant, kharlan: Backport for [[gerrit:1296649{{!}}hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296649{{!}}hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)]]
* 13:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add codfw d3 and e5 public vlans - ayounsi@cumin1003"
* 13:09 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add codfw d3 and e5 public vlans - ayounsi@cumin1003"
* 13:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P93708 and previous config saved to /var/cache/conftool/dbconfig/20260603-130548-fceratto.json
* 13:05 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93706 and previous config saved to /var/cache/conftool/dbconfig/20260603-125540-fceratto.json
* 12:51 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297110{{!}}ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)]] (duration: 07m 44s)
* 12:49 jgreen@dns1004: END - running authdns-update
* 12:47 jgreen@dns1004: START - running authdns-update
* 12:46 jiji@deploy1003: jiji: Continuing with deployment
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93705 and previous config saved to /var/cache/conftool/dbconfig/20260603-124624-fceratto.json
* 12:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93704 and previous config saved to /var/cache/conftool/dbconfig/20260603-124556-fceratto.json
* 12:45 jiji@deploy1003: jiji: Backport for [[gerrit:1297110{{!}}ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:43 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2186: Migration of db2186.codfw.wmnet completed
* 12:43 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1297110{{!}}ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)]]
* 12:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1067.eqiad.wmnet with OS bullseye
* 12:38 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1292364{{!}}Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)]] (duration: 11m 15s)
* 12:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2186.codfw.wmnet with OS trixie
* 12:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P93702 and previous config saved to /var/cache/conftool/dbconfig/20260603-123548-fceratto.json
* 12:34 dreamyjazz@deploy1003: somerandomdeveloper, dreamyjazz: Continuing with deployment
* 12:31 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1066.eqiad.wmnet with OS bullseye
* 12:29 dreamyjazz@deploy1003: somerandomdeveloper, dreamyjazz: Backport for [[gerrit:1292364{{!}}Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:27 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1292364{{!}}Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)]]
* 12:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P93701 and previous config saved to /var/cache/conftool/dbconfig/20260603-122541-fceratto.json
* 12:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1067.eqiad.wmnet with reason: host reimage
* 12:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2186.codfw.wmnet with reason: host reimage
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93700 and previous config saved to /var/cache/conftool/dbconfig/20260603-121533-fceratto.json
* 12:13 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ms-be1066.eqiad.wmnet with reason: host reimage
* 12:13 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2186.codfw.wmnet with reason: host reimage
* 12:11 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1067.eqiad.wmnet with reason: host reimage
* 12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93699 and previous config saved to /var/cache/conftool/dbconfig/20260603-120732-fceratto.json
* 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 12:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93698 and previous config saved to /var/cache/conftool/dbconfig/20260603-120634-fceratto.json
* 12:03 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1066.eqiad.wmnet with reason: host reimage
* 11:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P93697 and previous config saved to /var/cache/conftool/dbconfig/20260603-115626-fceratto.json
* 11:54 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2186.codfw.wmnet with OS trixie
* 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1067
* 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1067
* 11:52 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1067
* 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1067.eqiad.wmnet 96.48.64.10.in-addr.arpa 6.9.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:52 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1067.eqiad.wmnet 96.48.64.10.in-addr.arpa 6.9.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1067 - mvernon@cumin2002"
* 11:52 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1067 - mvernon@cumin2002"
* 11:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2186: Upgrading db2186.codfw.wmnet
* 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2186: Upgrading db2186.codfw.wmnet
* 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:47 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:46 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1067
* 11:46 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1067.eqiad.wmnet with OS bullseye
* 11:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P93695 and previous config saved to /var/cache/conftool/dbconfig/20260603-114618-fceratto.json
* 11:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1066
* 11:46 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1066
* 11:45 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1066
* 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1066.eqiad.wmnet 117.32.64.10.in-addr.arpa 7.1.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:45 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1066.eqiad.wmnet 117.32.64.10.in-addr.arpa 7.1.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1066 - mvernon@cumin2002"
* 11:45 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1066 - mvernon@cumin2002"
* 11:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 11:41 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:40 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1066
* 11:40 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1066.eqiad.wmnet with OS bullseye
* 11:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1067
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93693 and previous config saved to /var/cache/conftool/dbconfig/20260603-113611-fceratto.json
* 11:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Migration of db2196.codfw.wmnet completed
* 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93691 and previous config saved to /var/cache/conftool/dbconfig/20260603-112909-fceratto.json
* 11:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 6 hosts with reason: Maintenance
* 11:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
* 11:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93690 and previous config saved to /var/cache/conftool/dbconfig/20260603-112838-fceratto.json
* 11:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P93689 and previous config saved to /var/cache/conftool/dbconfig/20260603-111831-fceratto.json
* 11:14 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:09 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 11:09 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 11:08 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P93687 and previous config saved to /var/cache/conftool/dbconfig/20260603-110823-fceratto.json
* 11:07 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1066
* 11:07 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 11:06 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
* 11:05 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
* 11:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289895{{!}}Update UserInfoCard to be enabled by default for certain user groups (T426021)]] (duration: 07m 37s)
* 11:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:59 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 10:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:59 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 10:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 10:58 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93685 and previous config saved to /var/cache/conftool/dbconfig/20260603-105815-fceratto.json
* 10:58 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:56 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 10:55 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1289895{{!}}Update UserInfoCard to be enabled by default for certain user groups (T426021)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:54 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 10:54 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
* 10:53 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
* 10:53 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1289895{{!}}Update UserInfoCard to be enabled by default for certain user groups (T426021)]]
* 10:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1198 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93684 and previous config saved to /var/cache/conftool/dbconfig/20260603-105006-fceratto.json
* 10:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 10:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93683 and previous config saved to /var/cache/conftool/dbconfig/20260603-104939-fceratto.json
* 10:45 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:45 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Migration of db2196.codfw.wmnet completed
* 10:44 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:41 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:40 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P93681 and previous config saved to /var/cache/conftool/dbconfig/20260603-103931-fceratto.json
* 10:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1053: repool after upgrade
* 10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2196.codfw.wmnet with OS trixie
* 10:36 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297090{{!}}hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)]] (duration: 12m 03s)
* 10:32 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 10:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P93679 and previous config saved to /var/cache/conftool/dbconfig/20260603-102924-fceratto.json
* 10:26 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297090{{!}}hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:24 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297090{{!}}hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)]]
* 10:22 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1067
* 10:21 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1066
* 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93677 and previous config saved to /var/cache/conftool/dbconfig/20260603-101916-fceratto.json
* 10:15 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2013.codfw.wmnet
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
* 10:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93676 and previous config saved to /var/cache/conftool/dbconfig/20260603-101105-fceratto.json
* 10:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 10:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93675 and previous config saved to /var/cache/conftool/dbconfig/20260603-101037-fceratto.json
* 10:10 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2013.codfw.wmnet
* 10:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P93673 and previous config saved to /var/cache/conftool/dbconfig/20260603-100029-fceratto.json
* 09:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2196.codfw.wmnet with OS trixie
* 09:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: Upgrading db2196.codfw.wmnet
* 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2196: Upgrading db2196.codfw.wmnet
* 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1053: repool after upgrade
* 09:52 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:52 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:51 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:51 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:51 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P93670 and previous config saved to /var/cache/conftool/dbconfig/20260603-095022-fceratto.json
* 09:49 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:49 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:48 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1053.eqiad.wmnet with OS trixie
* 09:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2013.codfw.wmnet
* 09:41 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on es1053.eqiad.wmnet with reason: host reimage
* 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1053.eqiad.wmnet with reason: host reimage
* 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93669 and previous config saved to /var/cache/conftool/dbconfig/20260603-094014-fceratto.json
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2215: Migration of db2215.codfw.wmnet completed
* 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2013.codfw.wmnet
* 09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93667 and previous config saved to /var/cache/conftool/dbconfig/20260603-093146-fceratto.json
* 09:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93666 and previous config saved to /var/cache/conftool/dbconfig/20260603-093119-fceratto.json
* 09:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1211: Migration of db1211.eqiad.wmnet completed
* 09:27 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297069{{!}}hCaptcha: Collect risk score for blocked account creations (T427784)]] (duration: 07m 26s)
* 09:25 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1053.eqiad.wmnet with OS trixie
* 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add public1-b3-codfw gateway IPs - ayounsi@cumin1003"
* 09:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add public1-b3-codfw gateway IPs - ayounsi@cumin1003"
* 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1053: Upgrading es1053.eqiad.wmnet
* 09:23 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1053: Upgrading es1053.eqiad.wmnet
* 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:21 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297069{{!}}hCaptcha: Collect risk score for blocked account creations (T427784)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:21 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 09:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2054: repool after upgrade
* 09:21 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
* 09:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P93661 and previous config saved to /var/cache/conftool/dbconfig/20260603-092111-fceratto.json
* 09:20 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:20 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297069{{!}}hCaptcha: Collect risk score for blocked account creations (T427784)]]
* 09:14 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297065{{!}}Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"]] (duration: 07m 06s)
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P93659 and previous config saved to /var/cache/conftool/dbconfig/20260603-091104-fceratto.json
* 09:10 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:09 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297065{{!}}Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:07 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297065{{!}}Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"]]
* 09:06 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 09:06 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297064{{!}}Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] (duration: 10m 54s)
* 09:05 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 09:04 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 09:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003 - [[phab:T422043|T422043]]"
* 09:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93656 and previous config saved to /var/cache/conftool/dbconfig/20260603-090056-fceratto.json
* 09:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003 - [[phab:T422043|T422043]]"
* 09:00 ayounsi@cumin1003: END (ERROR) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=97) generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003"
* 09:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003"
* 08:59 kharlan@deploy1003: kharlan: Continuing with deployment
* 08:59 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297064{{!}}Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:55 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297064{{!}}Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]]
* 08:53 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296635{{!}}Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] (duration: 11m 43s)
* 08:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2215: Migration of db2215.codfw.wmnet completed
* 08:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet
* 08:52 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet
* 08:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb[1022-1023].eqiad.wmnet
* 08:51 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb[1022-1023].eqiad.wmnet
* 08:50 kharlan@deploy1003: kharlan: Rolling back deployment
* 08:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93652 and previous config saved to /var/cache/conftool/dbconfig/20260603-084846-fceratto.json
* 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 08:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93651 and previous config saved to /var/cache/conftool/dbconfig/20260603-084819-fceratto.json
* 08:47 kharlan@deploy1003: kharlan: Backport for [[gerrit:1296635{{!}}Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2215.codfw.wmnet with OS trixie
* 08:45 jiji@cumin1003: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) check docker-registry: maintenance
* 08:45 jiji@cumin1003: START - Cookbook sre.discovery.service-route check docker-registry: maintenance
* 08:43 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1211: Migration of db1211.eqiad.wmnet completed
* 08:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296635{{!}}Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]]
* 08:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1211.eqiad.wmnet with OS trixie
* 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93649 and previous config saved to /var/cache/conftool/dbconfig/20260603-083811-fceratto.json
* 08:37 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296632{{!}}Image Browsing: add accessible labels to carousel elements (T407793)]] (duration: 32m 11s)
* 08:36 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2054: repool after upgrade
* 08:35 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool es2054.codfw.wmnet: After reimage
* 08:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2054.codfw.wmnet: After reimage
* 08:35 jiji@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:34 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 08:34 jiji@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:33 jiji@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:33 jiji@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:31 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:31 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2054.codfw.wmnet with OS trixie
* 08:30 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:29 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2215.codfw.wmnet with reason: host reimage
* 08:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93647 and previous config saved to /var/cache/conftool/dbconfig/20260603-082804-fceratto.json
* 08:25 mszwarc@deploy1003: mlitn, mszwarc: Continuing with deployment
* 08:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1049: repool after upgrade
* 08:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2215.codfw.wmnet with reason: host reimage
* 08:22 mszwarc@deploy1003: mlitn, mszwarc: Backport for [[gerrit:1296632{{!}}Image Browsing: add accessible labels to carousel elements (T407793)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
* 08:18 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 08:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93645 and previous config saved to /var/cache/conftool/dbconfig/20260603-081756-fceratto.json
* 08:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 08:17 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 08:16 jiji@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2054.codfw.wmnet with reason: host reimage
* 08:08 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2054.codfw.wmnet with reason: host reimage
* 08:05 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1296632{{!}}Image Browsing: add accessible labels to carousel elements (T407793)]]
* {{safesubst:SAL entry|1=08:04 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296580{{!}}Add kha to wmgExtraLanguageNames (T427917)]], [[gerrit:1296703{{!}}jawiki: lift IP caps for workshop (T427912)]], [[gerrit:1296713{{!}}conductwiki: add sitename and logo (T426984 T427541)]], [[gerrit:1296627{{!}}Add missing lazy img to carousel (T427821)]], [[gerrit:1295968{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T426799)]]}}
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93643 and previous config saved to /var/cache/conftool/dbconfig/20260603-080346-fceratto.json
* 08:03 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1211.eqiad.wmnet with OS trixie
* 08:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 08:03 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2215.codfw.wmnet with OS trixie
* 08:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1211: Upgrading db1211.eqiad.wmnet
* 08:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2215: Upgrading db2215.codfw.wmnet
* 08:01 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:01 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1211: Upgrading db1211.eqiad.wmnet
* 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2215: Upgrading db2215.codfw.wmnet
* 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:01 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1157: Repooling
* 08:01 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1157: Repooling
* 08:00 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:57 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1022-1023].eqiad.wmnet with reason: Reimaging upstream server
* 07:57 mszwarc@deploy1003: anzx, mlitn, mfossati, mszwarc: Continuing with deployment
* 07:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Reimaging upstream server
* {{safesubst:SAL entry|1=07:54 mszwarc@deploy1003: anzx, mlitn, mfossati, mszwarc: Backport for [[gerrit:1296580{{!}}Add kha to wmgExtraLanguageNames (T427917)]], [[gerrit:1296703{{!}}jawiki: lift IP caps for workshop (T427912)]], [[gerrit:1296713{{!}}conductwiki: add sitename and logo (T426984 T427541)]], [[gerrit:1296627{{!}}Add missing lazy img to carousel (T427821)]], [[gerrit:1295968{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T42}}
* 07:52 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2231: repool after maintenance
* 07:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2054.codfw.wmnet with OS trixie
* 07:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2054: Upgrading es2054.codfw.wmnet
* 07:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2054: Upgrading es2054.codfw.wmnet
* 07:50 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1296580{{!}}Add kha to wmgExtraLanguageNames (T427917)]], [[gerrit:1296703{{!}}jawiki: lift IP caps for workshop (T427912)]], [[gerrit:1296713{{!}}conductwiki: add sitename and logo (T426984 T427541)]], [[gerrit:1296627{{!}}Add missing lazy img to carousel (T427821)]], [[gerrit:1295968{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T426799)]]
* 07:48 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296516{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]], [[gerrit:1296517{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]] (duration: 32m 13s)
* 07:44 marostegui@dns1004: END - running authdns-update
* 07:43 marostegui@dns1004: START - running authdns-update
* 07:42 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1056 to es2 eqiad primary [[phab:T427875|T427875]]', diff saved to https://phabricator.wikimedia.org/P93637 and previous config saved to /var/cache/conftool/dbconfig/20260603-074250-marostegui.json
* 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1049: repool after upgrade
* 07:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:35 mszwarc@deploy1003: mszwarc, stran: Continuing with deployment
* 07:35 mszwarc@deploy1003: mszwarc, stran: Backport for [[gerrit:1296516{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]], [[gerrit:1296517{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1049.eqiad.wmnet with OS trixie
* 07:16 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1296516{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]], [[gerrit:1296517{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]]
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1049.eqiad.wmnet with reason: host reimage
* 07:07 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1049.eqiad.wmnet with reason: host reimage
* 07:07 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2231: repool after maintenance
* 07:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2231.codfw.wmnet with OS trixie
* 06:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1049.eqiad.wmnet with OS trixie
* 06:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1049: Upgrading es1049.eqiad.wmnet
* 06:46 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2056 to es2 codfw primary [[phab:T427875|T427875]]', diff saved to https://phabricator.wikimedia.org/P93632 and previous config saved to /var/cache/conftool/dbconfig/20260603-064623-marostegui.json
* 06:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1049: Upgrading es1049.eqiad.wmnet
* 06:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1056: repool after upgrade
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2231.codfw.wmnet with reason: host reimage
* 06:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2231.codfw.wmnet with reason: host reimage
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2231.codfw.wmnet with OS trixie
* 06:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2231: Upgrading db2231.codfw.wmnet
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2231: Upgrading db2231.codfw.wmnet
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:59 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1056: repool after upgrade
* 05:59 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 05:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1056.eqiad.wmnet with OS trixie
* 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1056.eqiad.wmnet with reason: host reimage
* 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1056.eqiad.wmnet with reason: host reimage
* 05:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1056.eqiad.wmnet with OS trixie
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1056: Upgrading es1056.eqiad.wmnet
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1056: Upgrading es1056.eqiad.wmnet
* 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
== 2026-06-02 ==
* 22:21 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296689{{!}}hCaptcha: Correct inaccurate comment]] (duration: 06m 27s)
* 22:18 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:18 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:17 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 22:17 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296689{{!}}hCaptcha: Correct inaccurate comment]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:15 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296689{{!}}hCaptcha: Correct inaccurate comment]]
* 22:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296551{{!}}hCaptcha: Enable for badlogin on group0 wikis (T426875)]] (duration: 08m 31s)
* 22:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:09 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 22:07 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296551{{!}}hCaptcha: Enable for badlogin on group0 wikis (T426875)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:05 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296551{{!}}hCaptcha: Enable for badlogin on group0 wikis (T426875)]]
* 20:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93621 and previous config saved to /var/cache/conftool/dbconfig/20260602-203945-fceratto.json
* 20:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93620 and previous config saved to /var/cache/conftool/dbconfig/20260602-202937-fceratto.json
* 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1054.eqiad.wmnet
* 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1054.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:26 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1054.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:20 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 20:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93619 and previous config saved to /var/cache/conftool/dbconfig/20260602-201929-fceratto.json
* 20:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93618 and previous config saved to /var/cache/conftool/dbconfig/20260602-200922-fceratto.json
* 20:03 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1054.eqiad.wmnet
* 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1053.eqiad.wmnet
* 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1053.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:37 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1053.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93617 and previous config saved to /var/cache/conftool/dbconfig/20260602-190907-fceratto.json
* 19:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 19:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93616 and previous config saved to /var/cache/conftool/dbconfig/20260602-190811-fceratto.json
* 19:05 dancy@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.5 refs [[phab:T423914|T423914]]
* 18:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P93615 and previous config saved to /var/cache/conftool/dbconfig/20260602-185804-fceratto.json
* 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P93614 and previous config saved to /var/cache/conftool/dbconfig/20260602-184757-fceratto.json
* 18:38 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:38 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:38 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93612 and previous config saved to /var/cache/conftool/dbconfig/20260602-183749-fceratto.json
* 18:37 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:37 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:33 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1053.eqiad.wmnet
* 18:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93611 and previous config saved to /var/cache/conftool/dbconfig/20260602-183023-fceratto.json
* 18:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93610 and previous config saved to /var/cache/conftool/dbconfig/20260602-182956-fceratto.json
* 18:27 mutante: gerrit delete unused plugin projects: barricade, WikimediaBlocks and WikimediaWebSessions
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1052.eqiad.wmnet
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1052.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1052.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:25 dancy: Train is blocked at testwikis on https://phabricator.wikimedia.org/T427935
* 18:21 Daimona: Running query from [[phab:T427962|T427962]]#11978299 in x1.wikishared
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P93609 and previous config saved to /var/cache/conftool/dbconfig/20260602-181949-fceratto.json
* 18:16 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296615{{!}}feat(cleanMentorList): Add a feature flag (T427386)]], [[gerrit:1296614{{!}}feat(cleanMentorList): Add a feature flag (T427386)]] (duration: 34m 09s)
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:12 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:12 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:12 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P93608 and previous config saved to /var/cache/conftool/dbconfig/20260602-180941-fceratto.json
* 18:08 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 18:07 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 18:06 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 18:06 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 18:04 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 18:02 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 18:02 swfrench-wmf: reverting shellbox to 2026-05-20-192555 due to errors in shellbox-syntaxhighlight
* 18:02 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 18:01 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 18:01 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 18:01 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1296615{{!}}feat(cleanMentorList): Add a feature flag (T427386)]], [[gerrit:1296614{{!}}feat(cleanMentorList): Add a feature flag (T427386)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:00 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1052.eqiad.wmnet
* 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93607 and previous config saved to /var/cache/conftool/dbconfig/20260602-175933-fceratto.json
* 17:58 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:57 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1051.eqiad.wmnet
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1051.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:55 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1051.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:53 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93605 and previous config saved to /var/cache/conftool/dbconfig/20260602-175227-fceratto.json
* 17:52 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93604 and previous config saved to /var/cache/conftool/dbconfig/20260602-175157-fceratto.json
* 17:51 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:51 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:50 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:50 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:50 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:49 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:49 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:48 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:48 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:47 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:44 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:42 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:42 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:42 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P93603 and previous config saved to /var/cache/conftool/dbconfig/20260602-174150-fceratto.json
* 17:41 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1296615{{!}}feat(cleanMentorList): Add a feature flag (T427386)]], [[gerrit:1296614{{!}}feat(cleanMentorList): Add a feature flag (T427386)]]
* 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P93602 and previous config saved to /var/cache/conftool/dbconfig/20260602-173143-fceratto.json
* 17:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93601 and previous config saved to /var/cache/conftool/dbconfig/20260602-172135-fceratto.json
* 17:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93600 and previous config saved to /var/cache/conftool/dbconfig/20260602-171422-fceratto.json
* 17:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93599 and previous config saved to /var/cache/conftool/dbconfig/20260602-171354-fceratto.json
* 17:04 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P93598 and previous config saved to /var/cache/conftool/dbconfig/20260602-170344-fceratto.json
* 16:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P93597 and previous config saved to /var/cache/conftool/dbconfig/20260602-165336-fceratto.json
* 16:49 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1051.eqiad.wmnet
* 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1050.eqiad.wmnet
* 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1050.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:47 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1050.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93596 and previous config saved to /var/cache/conftool/dbconfig/20260602-164328-fceratto.json
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93595 and previous config saved to /var/cache/conftool/dbconfig/20260602-163622-fceratto.json
* 16:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 16:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93594 and previous config saved to /var/cache/conftool/dbconfig/20260602-163550-fceratto.json
* 16:34 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:34 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS trixie
* 16:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:27 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2006.codfw.wmnet with OS trixie
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P93593 and previous config saved to /var/cache/conftool/dbconfig/20260602-162542-fceratto.json
* 16:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P93591 and previous config saved to /var/cache/conftool/dbconfig/20260602-161534-fceratto.json
* 16:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 16:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS trixie
* 16:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296624{{!}}Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] (duration: 06m 40s)
* 16:09 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
* 16:05 kharlan@deploy1003: kharlan: Continuing with deployment
* 16:05 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 16:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93590 and previous config saved to /var/cache/conftool/dbconfig/20260602-160527-fceratto.json
* 16:05 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
* 16:05 kharlan@deploy1003: kharlan: Backport for [[gerrit:1296624{{!}}Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:03 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296624{{!}}Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]]
* 15:59 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295909{{!}}hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)]] (duration: 09m 48s)
* 15:59 kharlan@deploy1003: kharlan: Rolling back deployment
* 15:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93589 and previous config saved to /var/cache/conftool/dbconfig/20260602-155817-fceratto.json
* 15:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 15:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93588 and previous config saved to /var/cache/conftool/dbconfig/20260602-155749-fceratto.json
* 15:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 15:53 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS trixie
* 15:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS trixie
* 15:51 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295909{{!}}hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:50 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 15:49 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295909{{!}}hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)]]
* 15:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P93587 and previous config saved to /var/cache/conftool/dbconfig/20260602-154742-fceratto.json
* 15:47 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296558{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]], [[gerrit:1296568{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]] (duration: 07m 24s)
* 15:43 kharlan@deploy1003: kharlan: Continuing with deployment
* 15:42 kharlan@deploy1003: kharlan: Backport for [[gerrit:1296558{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]], [[gerrit:1296568{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:40 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296558{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]], [[gerrit:1296568{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]]
* 15:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P93586 and previous config saved to /var/cache/conftool/dbconfig/20260602-153734-fceratto.json
* 15:37 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS trixie
* 15:36 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS trixie
* 15:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 15:32 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:32 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:31 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 15:30 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:29 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93585 and previous config saved to /var/cache/conftool/dbconfig/20260602-152726-fceratto.json
* 15:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2158: Repooling
* {{safesubst:SAL entry|1=15:22 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295502{{!}}Revert "labswiki: Disallow account autocreation"]], [[gerrit:1283106{{!}}Remove unused 'writeapi' right]], [[gerrit:1296566{{!}}Clean up bot password configuration]], [[gerrit:1296563{{!}}Remove workaround for stuck session cookies on Wikitech (T389433)]], [[gerrit:1295574{{!}}cswiki: lift IP cap for workshop on 08-June-2026 (T427678)]], [[gerrit:1296582{{!}}U}}
* 15:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 15:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93583 and previous config saved to /var/cache/conftool/dbconfig/20260602-152026-fceratto.json
* 15:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93582 and previous config saved to /var/cache/conftool/dbconfig/20260602-151958-fceratto.json
* 15:19 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:19 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:18 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS trixie
* 15:18 dreamyjazz@deploy1003: matmarex, anzx, dreamyjazz: Continuing with deployment
* 15:18 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:15 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* {{safesubst:SAL entry|1=15:15 dreamyjazz@deploy1003: matmarex, anzx, dreamyjazz: Backport for [[gerrit:1295502{{!}}Revert "labswiki: Disallow account autocreation"]], [[gerrit:1283106{{!}}Remove unused 'writeapi' right]], [[gerrit:1296566{{!}}Clean up bot password configuration]], [[gerrit:1296563{{!}}Remove workaround for stuck session cookies on Wikitech (T389433)]], [[gerrit:1295574{{!}}cswiki: lift IP cap for workshop on 08-June-2026 (T427678)]], [[gerrit:1296582}}
* 15:14 jiji@cumin1003: START - Cookbook sre.dns.netbox
* {{safesubst:SAL entry|1=15:13 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1295502{{!}}Revert "labswiki: Disallow account autocreation"]], [[gerrit:1283106{{!}}Remove unused 'writeapi' right]], [[gerrit:1296566{{!}}Clean up bot password configuration]], [[gerrit:1296563{{!}}Remove workaround for stuck session cookies on Wikitech (T389433)]], [[gerrit:1295574{{!}}cswiki: lift IP cap for workshop on 08-June-2026 (T427678)]], [[gerrit:1296582{{!}}Us}}
* 15:12 jayme@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2006.codfw.wmnet with OS trixie
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS trixie
* 15:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P93580 and previous config saved to /var/cache/conftool/dbconfig/20260602-150951-fceratto.json
* 15:09 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296514{{!}}[Growth] Set wgGEMentorshipCleanupEnabled to false on all wikis (T427386)]] (duration: 06m 22s)
* 15:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Repooling after Icinga wait-for-green timeout
* 15:06 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1050.eqiad.wmnet
* 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1049.eqiad.wmnet
* 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1049.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:05 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1049.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:02 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1296514{{!}}[Growth] Set wgGEMentorshipCleanupEnabled to false on all wikis (T427386)]]
* 15:02 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS trixie
* 15:01 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P93578 and previous config saved to /var/cache/conftool/dbconfig/20260602-145943-fceratto.json
* 14:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 14:52 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:52 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:52 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1049.eqiad.wmnet
* 14:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS trixie
* 14:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:50 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93575 and previous config saved to /var/cache/conftool/dbconfig/20260602-144935-fceratto.json
* 14:42 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for pc2021.codfw.wmnet
* 14:42 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for pc2021.codfw.wmnet
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2250.codfw.wmnet
* 14:41 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2250.codfw.wmnet
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2158.codfw.wmnet
* 14:41 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2158.codfw.wmnet
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Repooling
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Repooling
* 14:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93573 and previous config saved to /var/cache/conftool/dbconfig/20260602-144110-fceratto.json
* 14:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2158: Repooling
* 14:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93571 and previous config saved to /var/cache/conftool/dbconfig/20260602-144043-fceratto.json
* 14:38 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:38 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:38 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1048.eqiad.wmnet
* 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1048.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 14:37 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS trixie
* 14:36 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS trixie
* 14:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P93569 and previous config saved to /var/cache/conftool/dbconfig/20260602-143035-fceratto.json
* 14:30 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 14:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1048.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 14:21 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Repooling after Icinga wait-for-green timeout
* 14:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P93566 and previous config saved to /var/cache/conftool/dbconfig/20260602-142027-fceratto.json
* 14:17 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS trixie
* 14:17 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 14:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1167.eqiad.wmnet
* 14:17 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1167.eqiad.wmnet
* 14:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS trixie
* 14:15 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS trixie
* 14:14 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93564 and previous config saved to /var/cache/conftool/dbconfig/20260602-141019-fceratto.json
* 14:09 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments userOptions.php --delete --nowarn growthexperiments-homepage-variant # [[phab:T417621|T417621]]
* 14:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1048.eqiad.wmnet
* 14:08 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments userOptions.php --delete growthexperiments-homepage-variant # [[phab:T417621|T417621]]
* 14:05 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93563 and previous config saved to /var/cache/conftool/dbconfig/20260602-140140-fceratto.json
* 14:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 14:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 14:01 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS trixie
* 14:00 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 14:00 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93562 and previous config saved to /var/cache/conftool/dbconfig/20260602-140022-fceratto.json
* 14:00 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS trixie
* 13:56 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1167.eqiad.wmnet with OS trixie
* 13:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P93561 and previous config saved to /var/cache/conftool/dbconfig/20260602-135015-fceratto.json
* 13:47 topranks: revert all config to normal on cr1-codfw and ssw1-a1-codfw
* 13:43 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS trixie
* 13:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 13:40 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS trixie
* 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P93560 and previous config saved to /var/cache/conftool/dbconfig/20260602-134007-fceratto.json
* 13:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
* 13:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1002.eqiad.wmnet with OS trixie
* 13:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1003.eqiad.wmnet with OS trixie
* 13:34 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:34 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:32 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 13:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93559 and previous config saved to /var/cache/conftool/dbconfig/20260602-132959-fceratto.json
* 13:27 slyngshede@dns1004: END - running authdns-update
* 13:25 slyngshede@dns1004: START - running authdns-update
* 13:24 topranks: increase OSPF cost on ssw1-a1-codfw et-0/0/4 towards lsw1-a5-codfw [[phab:T427301|T427301]]
* 13:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 13:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93558 and previous config saved to /var/cache/conftool/dbconfig/20260602-132314-fceratto.json
* 13:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
* 13:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93557 and previous config saved to /var/cache/conftool/dbconfig/20260602-132246-fceratto.json
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS trixie
* 13:19 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 13:19 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS trixie
* 13:18 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 13:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2049: repool after upgrade
* 13:17 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main'.
* 13:16 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1167.eqiad.wmnet with OS trixie
* 13:15 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main'.
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1167: Upgrading db1167.eqiad.wmnet
* 13:13 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1167: Upgrading db1167.eqiad.wmnet
* 13:13 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:12 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 13:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P93554 and previous config saved to /var/cache/conftool/dbconfig/20260602-131238-fceratto.json
* 13:12 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 13:12 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 13:11 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 13:07 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1003.eqiad.wmnet with OS trixie
* 13:07 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS trixie
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS trixie
* 13:04 jayme@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2006.codfw.wmnet with OS trixie
* 13:04 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:03 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1022-1023].eqiad.wmnet with reason: Reimaging upstream servers
* 13:03 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1001.eqiad.wmnet with OS trixie
* 13:03 topranks: increase OSPF cost on ssw1-a1-codfw et-0/0/2 towards lsw1-a3-codfw [[phab:T427301|T427301]]
* 13:03 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 13:02 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Reimaging upstream servers
* 13:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P93553 and previous config saved to /var/cache/conftool/dbconfig/20260602-130230-fceratto.json
* 12:59 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2161: Migration of db2161.codfw.wmnet completed
* 12:54 topranks: shutdown sub-interfaces on cr1-codfw et-1/1/5 for row A/B vlans [[phab:T427301|T427301]]
* 12:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93550 and previous config saved to /var/cache/conftool/dbconfig/20260602-125223-fceratto.json
* 12:50 topranks: enable bgp graceful-shutdown in overlay on ssw1-a1-codfw [[phab:T427301|T427301]]
* 12:49 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1061.eqiad.wmnet with OS trixie
* 12:48 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt
* 12:48 ayounsi@cumin1003: START - Cookbook sre.hosts.remove-downtime for lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt
* 12:47 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS trixie
* 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93548 and previous config saved to /var/cache/conftool/dbconfig/20260602-124541-fceratto.json
* 12:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1207.eqiad.wmnet with reason: Maintenance
* 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93547 and previous config saved to /var/cache/conftool/dbconfig/20260602-124512-fceratto.json
* 12:43 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1060.eqiad.wmnet with OS trixie
* 12:42 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:42 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 12:42 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 12:41 topranks: enable bgp graceful-shutdown in underlay on ssw1-a1-codfw [[phab:T427301|T427301]]
* 12:35 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 12:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P93545 and previous config saved to /var/cache/conftool/dbconfig/20260602-123505-fceratto.json
* 12:33 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main'.
* 12:33 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 12:31 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2049: repool after upgrade
* 12:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:29 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS trixie
* 12:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2049.codfw.wmnet with OS trixie
* 12:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P93542 and previous config saved to /var/cache/conftool/dbconfig/20260602-122459-fceratto.json
* 12:24 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS trixie
* 12:21 XioNoX: reboot lsw1-a3-codfw for software upgrade - [[phab:T427301|T427301]]
* 12:20 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS trixie
* 12:20 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 12:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS trixie
* 12:17 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 12:16 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296532{{!}}hCaptcha: Deduplicate edit API detection code (T427887)]], [[gerrit:1296533{{!}}hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)]] (duration: 09m 02s)
* 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93539 and previous config saved to /var/cache/conftool/dbconfig/20260602-121451-fceratto.json
* 12:11 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2049.codfw.wmnet with reason: host reimage
* 12:11 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt with reason: Switch maintenance
* 12:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2161: Migration of db2161.codfw.wmnet completed
* 12:09 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Switch maintenance
* 12:09 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296532{{!}}hCaptcha: Deduplicate edit API detection code (T427887)]], [[gerrit:1296533{{!}}hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1200 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93537 and previous config saved to /var/cache/conftool/dbconfig/20260602-120755-fceratto.json
* 12:07 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
* 12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93536 and previous config saved to /var/cache/conftool/dbconfig/20260602-120728-fceratto.json
* 12:07 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 12:07 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296532{{!}}hCaptcha: Deduplicate edit API detection code (T427887)]], [[gerrit:1296533{{!}}hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)]]
* 12:05 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2049.codfw.wmnet with reason: host reimage
* 12:04 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 12:02 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 12:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2161.codfw.wmnet with OS trixie
* 12:00 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 11:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P93535 and previous config saved to /var/cache/conftool/dbconfig/20260602-115721-fceratto.json
* 11:55 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 11:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:55 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 11:53 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 11:53 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 11:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:50 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS trixie
* 11:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS trixie
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2049.codfw.wmnet with OS trixie
* 11:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2049: Upgrading es2049.codfw.wmnet
* 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2049: Upgrading es2049.codfw.wmnet
* 11:47 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:47 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS trixie
* 11:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2056: repool after upgrade
* 11:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P93532 and previous config saved to /var/cache/conftool/dbconfig/20260602-114713-fceratto.json
* 11:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS trixie
* 11:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2161.codfw.wmnet with reason: host reimage
* 11:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2161.codfw.wmnet with reason: host reimage
* 11:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93531 and previous config saved to /var/cache/conftool/dbconfig/20260602-113705-fceratto.json
* 11:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1185 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93529 and previous config saved to /var/cache/conftool/dbconfig/20260602-113019-fceratto.json
* 11:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
* 11:29 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 11:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1161: Repooling
* 11:26 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1161: Repooling
* 11:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS trixie
* 11:22 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 11:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2161: Upgrading db2161.codfw.wmnet
* 11:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2161: Upgrading db2161.codfw.wmnet
* 11:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 11:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P93527 and previous config saved to /var/cache/conftool/dbconfig/20260602-111954-fceratto.json
* 11:15 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2161 [[phab:T427892|T427892]]', diff saved to https://phabricator.wikimedia.org/P93525 and previous config saved to /var/cache/conftool/dbconfig/20260602-111511-cwilliams.json
* 11:12 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2165 to s8 primary [[phab:T427892|T427892]]', diff saved to https://phabricator.wikimedia.org/P93524 and previous config saved to /var/cache/conftool/dbconfig/20260602-111200-cwilliams.json
* 11:10 cezmunsta: Starting s8 codfw failover from db2161 to db2165 - [[phab:T427892|T427892]]
* 11:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P93523 and previous config saved to /var/cache/conftool/dbconfig/20260602-110947-fceratto.json
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS trixie
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS trixie
* 11:04 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2165 with weight 0 [[phab:T427892|T427892]]', diff saved to https://phabricator.wikimedia.org/P93522 and previous config saved to /var/cache/conftool/dbconfig/20260602-110420-cwilliams.json
* 11:03 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 [[phab:T427892|T427892]]
* 11:02 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2056: repool after upgrade
* 11:01 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93520 and previous config saved to /var/cache/conftool/dbconfig/20260602-105939-fceratto.json
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93519 and previous config saved to /var/cache/conftool/dbconfig/20260602-105239-fceratto.json
* 10:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 10:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93518 and previous config saved to /var/cache/conftool/dbconfig/20260602-105202-fceratto.json
* 10:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2056.codfw.wmnet with OS trixie
* 10:42 moritzm: installing busybox security updates
* 10:42 claime: Enabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P93517 and previous config saved to /var/cache/conftool/dbconfig/20260602-104154-fceratto.json
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P93516 and previous config saved to /var/cache/conftool/dbconfig/20260602-103146-fceratto.json
* 10:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2056.codfw.wmnet with reason: host reimage
* 10:27 claime: Disabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 10:25 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2056.codfw.wmnet with reason: host reimage
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93515 and previous config saved to /var/cache/conftool/dbconfig/20260602-102139-fceratto.json
* 10:09 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2056.codfw.wmnet with OS trixie
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2056: Upgrading es2056.codfw.wmnet
* 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2056: Upgrading es2056.codfw.wmnet
* 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 09:56 claime: Enabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 09:46 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on cumin2003.codfw.wmnet with reason: in setup
* 09:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1187: Pooling
* 09:37 claime: Running puppet on cp6010 and cp6011 - [[phab:T422937|T422937]]
* 09:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow2004.codfw.wmnet to plain
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93511 and previous config saved to /var/cache/conftool/dbconfig/20260602-093716-fceratto.json
* 09:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1159.eqiad.wmnet with reason: Maintenance
* 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow2004.codfw.wmnet to plain
* 09:34 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of rpki2003.codfw.wmnet to plain
* 09:34 claime: Disabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 09:34 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of rpki2003.codfw.wmnet to plain
* 09:32 moritzm: temporarily remove ganeti2045 from the codfw cluster [[phab:T427357|T427357]]
* 09:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1055.eqiad.wmnet with OS trixie
* 09:15 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1187: Pooling
* 09:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1187 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93508 and previous config saved to /var/cache/conftool/dbconfig/20260602-091126-fceratto.json
* 09:09 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1187 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93506 and previous config saved to /var/cache/conftool/dbconfig/20260602-090432-fceratto.json
* 09:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
* 08:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2250.codfw.wmnet with reason: rack A3 maintenance
* 08:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS trixie
* 08:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:53 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:52 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:51 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:50 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 08:41 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:37 urbanecm: Reset user email of Barras@votewiki to match that of Barras@SUL
* 08:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 08:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93505 and previous config saved to /var/cache/conftool/dbconfig/20260602-083033-fceratto.json
* 08:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:29 slyngs: IDP: new configuration in preparation for WebAuthn
* 08:20 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P93504 and previous config saved to /var/cache/conftool/dbconfig/20260602-082026-fceratto.json
* 08:19 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:18 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:18 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:17 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296488{{!}}Revert "translate: adding separate read/write endpoints" (T425377)]] (duration: 03m 33s)
* 08:16 atsuko@deploy1003: atsuko: Rolling back deployment
* 08:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2053: repool after upgrade
* 08:15 atsuko@deploy1003: atsuko: Backport for [[gerrit:1296488{{!}}Revert "translate: adding separate read/write endpoints" (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:13 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1296488{{!}}Revert "translate: adding separate read/write endpoints" (T425377)]]
* 08:11 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:10 marostegui: Install MariaDB 10.11.17 on es2053 [[phab:T427345|T427345]]
* 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P93502 and previous config saved to /var/cache/conftool/dbconfig/20260602-081018-fceratto.json
* 08:09 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Depool for rack maintenance
* 08:03 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296262{{!}}translate: fixing missed variable in credentials formatting closure (T425377)]] (duration: 14m 47s)
* 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93499 and previous config saved to /var/cache/conftool/dbconfig/20260602-080011-fceratto.json
* 07:59 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 07:59 atsuko@deploy1003: atsuko: Rolling back deployment
* 07:58 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 07:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1181 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93498 and previous config saved to /var/cache/conftool/dbconfig/20260602-075759-fceratto.json
* 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 07:57 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
* 07:50 atsuko@deploy1003: atsuko: Backport for [[gerrit:1296262{{!}}translate: fixing missed variable in credentials formatting closure (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:49 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1296262{{!}}translate: fixing missed variable in credentials formatting closure (T425377)]]
* 07:48 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1181: Pooling
* 07:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1181: Pooling
* 07:44 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1181: Reboot
* 07:43 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1181: Reboot
* 07:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1181.eqiad.wmnet with reason: Reboot
* 07:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1181: Migration of db1181.eqiad.wmnet completed
* 07:40 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294949{{!}}translate: adding separate read/write endpoints (T425377)]] (duration: 21m 01s)
* 07:39 atsuko@deploy1003: atsuko: Rolling back deployment
* 07:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93490 and previous config saved to /var/cache/conftool/dbconfig/20260602-073904-fceratto.json
* 07:32 XioNoX: pfw1-eqiad# delete protocols bgp group Production family inet6 - [[phab:T423384|T423384]]
* 07:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2053: repool after upgrade
* 07:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2158.codfw.wmnet with reason: rack A3 maintenance
* 07:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93487 and previous config saved to /var/cache/conftool/dbconfig/20260602-072856-fceratto.json
* 07:28 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2158: rack A3 maintenance
* 07:28 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2158: rack A3 maintenance
* 07:27 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on pc2021.codfw.wmnet with reason: rack A3 maintenance
* 07:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: rack A3 maintenance
* 07:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 07:25 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 07:25 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: rack A3 maintenance
* 07:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Depool for rack maintenance
* 07:23 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2241.codfw.wmnet
* 07:23 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2241.codfw.wmnet
* 07:21 atsuko@deploy1003: atsuko: Backport for [[gerrit:1294949{{!}}translate: adding separate read/write endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2053.codfw.wmnet with OS trixie
* 07:19 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1294949{{!}}translate: adding separate read/write endpoints (T425377)]]
* 07:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2241.codfw.wmnet with reason: Depool for rack maintenance
* 07:14 marostegui: Install mariadb 10.11.17 on db2186 [[phab:T427345|T427345]]
* 07:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Depool for rack maintenance
* 07:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2186.codfw.wmnet with reason: upgrade
* 07:12 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Depool for rack maintenance
* 07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2053.codfw.wmnet with reason: host reimage
* 06:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2053.codfw.wmnet with reason: host reimage
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93478 and previous config saved to /var/cache/conftool/dbconfig/20260602-065533-fceratto.json
* 06:55 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1181: Migration of db1181.eqiad.wmnet completed
* 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 06:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1181.eqiad.wmnet with OS trixie
* 06:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2053.codfw.wmnet with OS trixie
* 06:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2053: Upgrading es2053.codfw.wmnet
* 06:41 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2053: Upgrading es2053.codfw.wmnet
* 06:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 06:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 06:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 06:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1052: repool after upgrade
* 06:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
* 06:24 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
* 06:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 06:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 06:16 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 06:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 06:08 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1181.eqiad.wmnet with OS trixie
* 06:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1181: Upgrading db1181.eqiad.wmnet
* 06:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1181: Upgrading db1181.eqiad.wmnet
* 06:04 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:02 marostegui@dns1004: END - running authdns-update
* 06:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1181 [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93473 and previous config saved to /var/cache/conftool/dbconfig/20260602-060157-marostegui.json
* 06:01 marostegui@dns1004: START - running authdns-update
* 06:00 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1236 to s7 primary and set section read-write [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93472 and previous config saved to /var/cache/conftool/dbconfig/20260602-060041-marostegui.json
* 06:00 marostegui@cumin1003: dbctl commit (dc=all): 'Set s7 eqiad as read-only for maintenance - [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93471 and previous config saved to /var/cache/conftool/dbconfig/20260602-060018-marostegui.json
* 06:00 marostegui: Starting s7 eqiad failover from db1181 to db1236 - [[phab:T426088|T426088]]
* 05:51 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1236 with weight 0 [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93470 and previous config saved to /var/cache/conftool/dbconfig/20260602-055153-marostegui.json
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s7 [[phab:T426088|T426088]]
* 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1052: repool after upgrade
* 05:50 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 05:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1052.eqiad.wmnet with OS trixie
* 05:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:29 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1052.eqiad.wmnet with reason: host reimage
* 05:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1052.eqiad.wmnet with reason: host reimage
* 05:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:07 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1052.eqiad.wmnet with OS trixie
* 05:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1052: Upgrading es1052.eqiad.wmnet
* 05:06 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1052: Upgrading es1052.eqiad.wmnet
* 05:05 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 04:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 04:49 ryankemper: [[phab:T425007|T425007]] (k8s) created 4 wdqs namespaces on `dse-k8s-codfw`'s `admin_ng` ns: `wdqs-[internal,external]` & `wdqs-[internal,external]-next`; certs issued
* 04:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 04:40 ryankemper@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 04:36 ryankemper@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 04:05 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.2 (duration: 05m 33s)
== 2026-06-01 ==
* 23:27 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295963{{!}}Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542)]], [[gerrit:1295962{{!}}Carousel: Defer to MobileFrontend lightbox on mobile (T427679)]] (duration: 07m 17s)
* 23:23 jdlrobson@deploy1003: mfossati, jdlrobson: Continuing with deployment
* 23:22 jdlrobson@deploy1003: mfossati, jdlrobson: Backport for [[gerrit:1295963{{!}}Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542)]], [[gerrit:1295962{{!}}Carousel: Defer to MobileFrontend lightbox on mobile (T427679)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:20 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1295963{{!}}Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542)]], [[gerrit:1295962{{!}}Carousel: Defer to MobileFrontend lightbox on mobile (T427679)]]
* 23:15 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296022{{!}}Donor Delight Badge: Add dependency on mw.user (T427850)]], [[gerrit:1296028{{!}}styles: Limit selector to badge client pref (T427407)]] (duration: 09m 33s)
* 23:11 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:07 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1296022{{!}}Donor Delight Badge: Add dependency on mw.user (T427850)]], [[gerrit:1296028{{!}}styles: Limit selector to badge client pref (T427407)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:06 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1296022{{!}}Donor Delight Badge: Add dependency on mw.user (T427850)]], [[gerrit:1296028{{!}}styles: Limit selector to badge client pref (T427407)]]
* 23:04 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6015.*
* 22:36 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296024{{!}}Add maintenance script to scrape SVG render files]] (duration: 06m 22s)
* 22:32 reedy@deploy1003: reedy: Continuing with deployment
* 22:31 reedy@deploy1003: reedy: Backport for [[gerrit:1296024{{!}}Add maintenance script to scrape SVG render files]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:30 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1296024{{!}}Add maintenance script to scrape SVG render files]]
* 22:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 22:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 22:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 21:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 21:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 21:51 sbassett: Deployed updated mitigation for [[phab:T326691|T326691]]
* 21:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 21:35 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 21:35 maryum: Deployed security fix for [[phab:T427611|T427611]]
* 21:35 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 21:33 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 21:32 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 21:27 maryum: Deployed security fix for [[phab:T427235|T427235]]
* 21:13 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296002{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565)]], [[gerrit:1296003{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T427565)]], [[gerrit:1296009{{!}}Redirect Special:AccountRecovery to the shared domain (T427692)]] (duration: 09m 20s)
* 21:09 catrope@deploy1003: catrope, arlolra: Continuing with deployment
* 21:09 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 21:09 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 21:08 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 21:07 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 21:07 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 21:06 catrope@deploy1003: catrope, arlolra: Backport for [[gerrit:1296002{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565)]], [[gerrit:1296003{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T427565)]], [[gerrit:1296009{{!}}Redirect Special:AccountRecovery to the shared domain (T427692)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:04 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1296002{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565)]], [[gerrit:1296003{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T427565)]], [[gerrit:1296009{{!}}Redirect Special:AccountRecovery to the shared domain (T427692)]]
* 20:53 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 20:37 ryankemper@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wdqs1015.eqiad.wmnet with reason: [[phab:T427852|T427852]] hw failure
* 20:26 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285412{{!}}Remove `wgTestKitchenExperimentStreamNames` (T422358)]], [[gerrit:1295531{{!}}Enable AbuseFilter block action on nlwiki (T427384)]] (duration: 07m 48s)
* 20:22 catrope@deploy1003: sfaci, xxblackburnxx, catrope: Continuing with deployment
* 20:20 catrope@deploy1003: sfaci, xxblackburnxx, catrope: Backport for [[gerrit:1285412{{!}}Remove `wgTestKitchenExperimentStreamNames` (T422358)]], [[gerrit:1295531{{!}}Enable AbuseFilter block action on nlwiki (T427384)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:18 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1285412{{!}}Remove `wgTestKitchenExperimentStreamNames` (T422358)]], [[gerrit:1295531{{!}}Enable AbuseFilter block action on nlwiki (T427384)]]
* 20:12 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295504{{!}}passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)]] (duration: 07m 37s)
* 20:08 catrope@deploy1003: catrope: Continuing with deployment
* 20:07 catrope@deploy1003: catrope: Backport for [[gerrit:1295504{{!}}passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:05 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1295504{{!}}passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)]]
* 19:48 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 19:47 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 19:47 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 19:46 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 19:46 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 19:45 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 19:01 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: sync
* 19:00 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: sync
* 18:24 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295950{{!}}mediawiki.user_change.dev0 - key by user.wiki_id (T426198)]] (duration: 06m 42s)
* 18:20 otto@deploy1003: otto: Continuing with deployment
* 18:19 otto@deploy1003: otto: Backport for [[gerrit:1295950{{!}}mediawiki.user_change.dev0 - key by user.wiki_id (T426198)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:17 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1295950{{!}}mediawiki.user_change.dev0 - key by user.wiki_id (T426198)]]
* 18:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 18:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 18:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd2001.codfw.wmnet to plain
* 18:02 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 18:02 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd2001.codfw.wmnet to plain
* 18:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2003.codfw.wmnet to plain
* 18:01 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 18:01 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2003.codfw.wmnet to plain
* 17:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 17:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 17:53 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS trixie
* 17:42 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295976{{!}}nlwiki: change to Wikipedia 25 logo (T424519)]] (duration: 07m 29s)
* 17:37 samtar@deploy1003: chlod, samtar: Continuing with deployment
* 17:36 samtar@deploy1003: chlod, samtar: Backport for [[gerrit:1295976{{!}}nlwiki: change to Wikipedia 25 logo (T424519)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:34 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1295976{{!}}nlwiki: change to Wikipedia 25 logo (T424519)]]
* 17:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1236: Update
* 17:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd2001.codfw.wmnet to drbd
* 17:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
* 17:04 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 17:04 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1180: Pooling
* 17:03 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 17:03 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
* 17:03 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 16:59 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd2001.codfw.wmnet to drbd
* 16:58 Amir1: drop flaggedrevs tables on wikinews wikis ([[phab:T423577|T423577]])
* 16:57 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 16:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93462 and previous config saved to /var/cache/conftool/dbconfig/20260601-165717-fceratto.json
* 16:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93460 and previous config saved to /var/cache/conftool/dbconfig/20260601-164709-fceratto.json
* 16:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
* 16:37 ryankemper@cumin2002: conftool action : set/pooled=no; selector: dc=eqiad,cluster=wdqs-main,service=wdqs-main,name=wdqs1015.eqiad.wmnet
* 16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93458 and previous config saved to /var/cache/conftool/dbconfig/20260601-163701-fceratto.json
* 16:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:35 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1236.eqiad.wmnet
* 16:35 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1236.eqiad.wmnet
* 16:35 ryankemper@cumin2002: conftool action : set/pooled=no; selector: dc=eqiad,cluster=wdqs,service=wdqs-main,name=wdqs1015.eqiad.wmnet
* 16:34 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
* 16:34 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1236: Update
* 16:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1236.eqiad.wmnet with reason: Kernel update [[phab:T426633|T426633]]
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:30 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1236.eqiad.wmnet
* 16:30 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1236.eqiad.wmnet
* 16:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
* 16:29 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1236: Update
* 16:29 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
* 16:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2003.codfw.wmnet to drbd
* 16:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93455 and previous config saved to /var/cache/conftool/dbconfig/20260601-162653-fceratto.json
* 16:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 16:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Migration of db1209.eqiad.wmnet completed
* 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1236.eqiad.wmnet with reason: Kernel update [[phab:T426633|T426633]]
* 16:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1236: Update
* 16:09 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1236: Update
* 16:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:06 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main'.
* 16:05 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2003.codfw.wmnet to drbd
* 16:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 16:03 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 16:02 moritzm: temporarily remove ganeti2027 from the codfw cluster [[phab:T427357|T427357]]
* 15:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:56 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.depool (exit_code=97) depool db1224: Pooling
* 15:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2005.codfw.wmnet with OS bullseye
* 15:53 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1224: Pooling
* 15:51 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 15:49 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
* 15:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
* 15:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
* 15:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2005.codfw.wmnet with reason: host reimage
* 15:40 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:40 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
* 15:40 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
* 15:40 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
* 15:40 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
* 15:40 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
* 15:39 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:39 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 15:39 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Migration of db1209.eqiad.wmnet completed
* 15:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 15:38 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:38 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
* 15:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2005.codfw.wmnet with reason: host reimage
* 15:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 15:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1209.eqiad.wmnet with OS trixie
* 15:28 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295802{{!}}hCaptcha: Raise SiteVerify error threshold to 100]] (duration: 06m 15s)
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93446 and previous config saved to /var/cache/conftool/dbconfig/20260601-152638-fceratto.json
* 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 15:26 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:25 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
* 15:25 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
* 15:25 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
* 15:25 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:24 kharlan@deploy1003: kharlan: Continuing with deployment
* 15:24 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295802{{!}}hCaptcha: Raise SiteVerify error threshold to 100]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:22 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host testvm2005.codfw.wmnet with OS bullseye
* 15:22 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295802{{!}}hCaptcha: Raise SiteVerify error threshold to 100]]
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:20 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295946{{!}}hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)]] (duration: 08m 24s)
* 15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:16 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
* 15:14 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1295946{{!}}hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1295946{{!}}hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)]]
* 15:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
* 15:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93445 and previous config saved to /var/cache/conftool/dbconfig/20260601-151024-fceratto.json
* 15:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:sessionstore
* 15:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93443 and previous config saved to /var/cache/conftool/dbconfig/20260601-150017-fceratto.json
* 14:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1209.eqiad.wmnet with OS trixie
* 14:52 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 14:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1209: Upgrading db1209.eqiad.wmnet
* 14:52 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 14:52 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1209: Upgrading db1209.eqiad.wmnet
* 14:52 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 14:51 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:51 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 14:50 atsuko@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 14:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93441 and previous config saved to /var/cache/conftool/dbconfig/20260601-145010-fceratto.json
* 14:49 atsuko@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 14:49 atsuko@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 14:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:42 atsuko@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 14:41 atsuko@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93440 and previous config saved to /var/cache/conftool/dbconfig/20260601-144002-fceratto.json
* 14:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:30 ladsgroup@deploy1003: Synchronized portals: Deploy portals ([[phab:T421797|T421797]]) (duration: 02m 43s)
* 14:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:27 ladsgroup@deploy1003: Synchronized portals/wikipedia.org/assets: Deploy portals ([[phab:T421797|T421797]]) (duration: 06m 10s)
* 14:25 sukhe@dns1004: END - running authdns-update
* 14:23 sukhe@dns1004: START - running authdns-update
* 14:22 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:11 Lucas_WMDE: UTC afternoon backport+config window done
* 14:10 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295918{{!}}Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)]] (duration: 11m 06s)
* 14:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:05 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Continuing with deployment
* 14:03 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Backport for [[gerrit:1295918{{!}}Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:01 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:sessionstore
* 13:58 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1295918{{!}}Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)]]
* 13:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1265.eqiad.wmnet with OS trixie
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93439 and previous config saved to /var/cache/conftool/dbconfig/20260601-133947-fceratto.json
* 13:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 13:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 13:35 atsukoito: restarted pybal.service on lvs2013
* 13:31 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 13:31 atsukoito: restarted pybal.service on lvs2014
* 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs-test2001.codfw.wmnet
* 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet
* 13:22 atsukoito: restarted pybal.service on lvs1019
* 13:22 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:21 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:20 atsukoito: restarted pybal.service on lvs1020
* 13:20 Msz2001: UTC afternoon backport+config window done
* 13:20 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295875{{!}}Add SetGlobalPreference maintenance script (T427476)]] (duration: 06m 22s)
* 13:19 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs-test2001.codfw.wmnet
* 13:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1265.eqiad.wmnet with OS trixie
* 13:18 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs-test1001.eqiad.wmnet
* 13:16 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 13:15 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1295875{{!}}Add SetGlobalPreference maintenance script (T427476)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 atsukoito: sudo cumin 'A:lvs-low-traffic-eqiad' 'systemctl restart pybal.service'
* 13:14 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1295875{{!}}Add SetGlobalPreference maintenance script (T427476)]]
* 13:12 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295536{{!}}swwiki: Enable the Visual Editor on the project namespace (T427117)]] (duration: 10m 06s)
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93438 and previous config saved to /var/cache/conftool/dbconfig/20260601-130949-fceratto.json
* 13:08 mszwarc@deploy1003: codenamenoreste, mszwarc: Continuing with deployment
* 13:07 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:06 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:05 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:04 mszwarc@deploy1003: codenamenoreste, mszwarc: Backport for [[gerrit:1295536{{!}}swwiki: Enable the Visual Editor on the project namespace (T427117)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:03 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:02 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1295536{{!}}swwiki: Enable the Visual Editor on the project namespace (T427117)]]
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93437 and previous config saved to /var/cache/conftool/dbconfig/20260601-125941-fceratto.json
* 12:56 dpogorzelski@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=inference,name=eqiad
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'readability' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 12:52 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:50 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93436 and previous config saved to /var/cache/conftool/dbconfig/20260601-124934-fceratto.json
* 12:48 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:41 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93435 and previous config saved to /var/cache/conftool/dbconfig/20260601-123926-fceratto.json
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:29 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:28 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:27 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to plain
* 12:26 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to plain
* 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
* 12:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to drbd
* 12:20 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:17 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:15 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:15 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:11 dpogorzelski@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=inference,name=eqiad
* 12:07 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to drbd
* 12:05 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
* 12:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:04 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2027.codfw.wmnet
* 12:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 11:59 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
* 11:59 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
* 11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93434 and previous config saved to /var/cache/conftool/dbconfig/20260601-113911-fceratto.json
* 11:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 11:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93433 and previous config saved to /var/cache/conftool/dbconfig/20260601-113843-fceratto.json
* 11:37 moritzm: installing Exim security updates
* 11:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93432 and previous config saved to /var/cache/conftool/dbconfig/20260601-112835-fceratto.json
* 11:25 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 11:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:22 moritzm: installing imagemagick security updates
* 11:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:22 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 11:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93430 and previous config saved to /var/cache/conftool/dbconfig/20260601-111827-fceratto.json
* 11:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:14 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 11:12 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 11:10 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93429 and previous config saved to /var/cache/conftool/dbconfig/20260601-110820-fceratto.json
* 11:04 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 11:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1055: repool after upgrade
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93427 and previous config saved to /var/cache/conftool/dbconfig/20260601-110121-fceratto.json
* 11:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 10:54 marostegui@dns1004: END - running authdns-update
* 10:52 marostegui@dns1004: START - running authdns-update
* 10:48 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1050 to es1 eqiad primary [[phab:T427032|T427032]]', diff saved to https://phabricator.wikimedia.org/P93425 and previous config saved to /var/cache/conftool/dbconfig/20260601-104837-marostegui.json
* 10:47 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2055 to es1 codfw primary [[phab:T427032|T427032]]', diff saved to https://phabricator.wikimedia.org/P93424 and previous config saved to /var/cache/conftool/dbconfig/20260601-104739-marostegui.json
* 10:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1177: Migration of db1177.eqiad.wmnet completed
* 10:40 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2003.codfw.wmnet
* 10:34 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2003.codfw.wmnet
* 10:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93421 and previous config saved to /var/cache/conftool/dbconfig/20260601-103316-fceratto.json
* 10:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93418 and previous config saved to /var/cache/conftool/dbconfig/20260601-102308-fceratto.json
* 10:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1055: repool after upgrade
* 10:15 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1055.eqiad.wmnet with OS trixie
* 10:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93415 and previous config saved to /var/cache/conftool/dbconfig/20260601-101300-fceratto.json
* 10:09 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:07 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93414 and previous config saved to /var/cache/conftool/dbconfig/20260601-100252-fceratto.json
* 10:00 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1177: Migration of db1177.eqiad.wmnet completed
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1055.eqiad.wmnet with reason: host reimage
* 09:56 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 09:54 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 09:53 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1055.eqiad.wmnet with reason: host reimage
* 09:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1177.eqiad.wmnet with OS trixie
* 09:51 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:50 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:39 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1055.eqiad.wmnet with OS trixie
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1055: Upgrading es1055.eqiad.wmnet
* 09:38 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1055: Upgrading es1055.eqiad.wmnet
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
* 09:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
* 09:17 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1177.eqiad.wmnet with OS trixie
* 09:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:14 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:13 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:12 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1177: Upgrading db1177.eqiad.wmnet
* 09:11 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1177: Upgrading db1177.eqiad.wmnet
* 09:11 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93410 and previous config saved to /var/cache/conftool/dbconfig/20260601-090237-fceratto.json
* 09:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93409 and previous config saved to /var/cache/conftool/dbconfig/20260601-090209-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P93408 and previous config saved to /var/cache/conftool/dbconfig/20260601-085202-fceratto.json
* 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P93407 and previous config saved to /var/cache/conftool/dbconfig/20260601-084154-fceratto.json
* 08:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93406 and previous config saved to /var/cache/conftool/dbconfig/20260601-083146-fceratto.json
* 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93405 and previous config saved to /var/cache/conftool/dbconfig/20260601-082442-fceratto.json
* 08:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 07:58 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295454{{!}}Disable the creation of synthetic main refs in production (T427484)]] (duration: 11m 26s)
* 07:56 XioNoX: add no_p2p term to pfw1-codfw BGP_fundraising_export - [[phab:T423384|T423384]]
* 07:52 wmde-fisch@deploy1003: lilients, wmde-fisch: Continuing with deployment
* 07:51 wmde-fisch@deploy1003: lilients, wmde-fisch: Backport for [[gerrit:1295454{{!}}Disable the creation of synthetic main refs in production (T427484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:47 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1295454{{!}}Disable the creation of synthetic main refs in production (T427484)]]
* 07:45 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294826{{!}}Update VE core submodule to master (9cf5524e7) (T424232)]] (duration: 31m 34s)
* 07:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 07:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 07:32 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
* 07:31 wmde-fisch@deploy1003: wmde-fisch: Backport for [[gerrit:1294826{{!}}Update VE core submodule to master (9cf5524e7) (T424232)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 07:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1294826{{!}}Update VE core submodule to master (9cf5524e7) (T424232)]]
* 06:48 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 06:47 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
== 2026-05-31 ==
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 30s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-30 ==
* 16:21 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:38 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 27s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-29 ==
* 23:39 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 23:37 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 21:42 catrope@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 21:41 catrope@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:40 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] (duration: 06m 54s)
* 17:35 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 17:34 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:33 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]]
* 16:30 jgreen@dns1004: END - running authdns-update
* 16:28 jgreen@dns1004: START - running authdns-update
* 16:13 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:12 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:28 dancy@deploy1003: Installation of scap version "4.267.0" completed for 2 hosts
* 15:26 dancy@deploy1003: Installing scap version "4.267.0" for 2 host(s)
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:15 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] (duration: 07m 58s)
* 14:11 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:09 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]]
* 13:53 moritzm: imported OpenJDK 21 21.0.11+10-1~deb12u1 to component/jdk21 (backport of latest Java 21 security release for Bookworm)
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1006.wikimedia.org
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1006.wikimedia.org with OS trixie
* 11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1006.wikimedia.org with OS trixie
* 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:15 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:13 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:00 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:00 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1006.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1005.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1005.wikimedia.org with OS trixie
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:40 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Pooling
* 10:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1005.wikimedia.org with OS trixie
* 10:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:55 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 09:50 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 09:49 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:44 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2014.codfw.wmnet with OS bookworm
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:20 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:12 jynus@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:10 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:03 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad2002.codfw.wmnet
* 08:59 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad2002.codfw.wmnet
* 08:59 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad2002.codfw.wmnet - [[phab:T427588|T427588]]
* 08:54 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:51 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad1004.eqiad.wmnet
* 08:50 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
* 08:50 jynus@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host backup2014.codfw.wmnet with OS bookworm
* 08:49 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
* 08:47 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad1004.eqiad.wmnet
* 08:46 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad1004.eqiad.wmnet - [[phab:T427588|T427588]]
* 08:42 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:39 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:39 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:38 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 08:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 08:33 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:31 jynus@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup2014.codfw.wmnet with OS bookworm
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:18 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 08:17 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 08:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1005.wikimedia.org
* 08:05 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 07:54 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 07:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2212.codfw.wmnet
* 07:54 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2212.codfw.wmnet
* 07:22 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2006.wikimedia.org
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2006.wikimedia.org with OS trixie
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2006.wikimedia.org with OS trixie
* 06:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2006.wikimedia.org
* 03:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts db1224.eqiad.wmnet
* 02:56 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 01:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5032.eqsin.wmnet with OS trixie
* 01:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 01:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 00:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 00:29 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host cp5032.eqsin.wmnet
* 00:23 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:22 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
== 2026-05-28 ==
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:07 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:34 andrewbogott: reprepro includedeb trixie-wikimedia /home/andrew/magnum-cluster-api_0.36.6-1~wmf13u2_amd64.deb
* 22:31 logmsgbot: dreamyjazz Deployed security patch for [[phab:T426388|T426388]]
* 21:33 maryum: Deployed security fix for [[phab:T426867|T426867]]
* 21:21 alexsanford: Deployed security fix for [[phab:T426889|T426889]]
* 21:07 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host cp5032.eqsin.wmnet
* 21:04 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 21:04 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 20:48 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] (duration: 07m 34s)
* 20:44 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:43 arlolra@deploy1003: arlolra: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:41 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]]
* 20:34 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] (duration: 07m 20s)
* 20:30 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:29 arlolra@deploy1003: arlolra: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:27 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]]
* 20:22 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] (duration: 09m 07s)
* 20:18 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Continuing with deployment
* 20:14 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] synced to the testservers (see https://wikitech.
* 20:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5032.eqsin.wmnet with OS trixie
* 20:13 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]]
* 19:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1018.eqiad.wmnet
* 19:27 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1018.eqiad.wmnet
* 19:09 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: Kernel reboot
* 19:09 brett: Stopping pybal/puppet/downtiming lvs1018.eqiad.wmnet for reboot
* 19:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1019.eqiad.wmnet
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1019.eqiad.wmnet
* 18:52 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:51 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:47 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:40 mutante: planet1003/planet2003 - apt-get upgrade - all pending package upgrades
* 18:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: Kernel reboot
* 18:34 brett: Stopping pybal/puppet/downtiming lvs1019.eqiad.wmnet for reboot and BIOS update/memory self-healing - [[phab:T426109|T426109]]
* 18:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2011.codfw.wmnet
* 18:25 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2011.codfw.wmnet
* 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Kernel reboot
* 18:19 brett: Stopping pybal/puppet/downtiming lvs2011.codfw.wmnet for reboot
* 18:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 18:06 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 18:00 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: Kernel reboot
* 17:57 brett: Stopping pybal/puppet/downtiming lvs2013.codfw.wmnet for reboot
* 17:19 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 16:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93393 and previous config saved to /var/cache/conftool/dbconfig/20260528-164514-fceratto.json
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93392 and previous config saved to /var/cache/conftool/dbconfig/20260528-163507-fceratto.json
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93391 and previous config saved to /var/cache/conftool/dbconfig/20260528-162459-fceratto.json
* 16:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db1224.eqiad.wmnet with reason: unreachable [[phab:T427535|T427535]]
* 16:17 swfrench-wmf: reprepro include xdebug_3.4.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include wikidiff2_1.14.1-2+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include php-yaml_2.2.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-xhprof_2.3.10-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-wmerrors_2.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-uuid_1.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-redis_6.2.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-pcov_1.0.12-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-memcached_3.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:15 swfrench-wmf: reprepro include php-luasandbox_4.1.2-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93390 and previous config saved to /var/cache/conftool/dbconfig/20260528-161452-fceratto.json
* 16:14 swfrench-wmf: reprepro include php-imagick_3.7.0-13+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:14 swfrench-wmf: reprepro include php-excimer_1.2.5-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1251 ([[phab:T426633|T426633]])', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260528-160646-fceratto.json
* 16:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1251.eqiad.wmnet with reason: Maintenance
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93388 and previous config saved to /var/cache/conftool/dbconfig/20260528-160613-fceratto.json
* 15:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93387 and previous config saved to /var/cache/conftool/dbconfig/20260528-155605-fceratto.json
* 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93386 and previous config saved to /var/cache/conftool/dbconfig/20260528-154557-fceratto.json
* 15:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93385 and previous config saved to /var/cache/conftool/dbconfig/20260528-153550-fceratto.json
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93384 and previous config saved to /var/cache/conftool/dbconfig/20260528-152736-fceratto.json
* 15:27 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: Maintenance
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93383 and previous config saved to /var/cache/conftool/dbconfig/20260528-152708-fceratto.json
* 15:20 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5032.eqsin.wmnet with reason: Testing reimaging on new subnet
* 15:18 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 15:17 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93382 and previous config saved to /var/cache/conftool/dbconfig/20260528-151701-fceratto.json
* 15:17 jhathaway: dmarc ingress test on mx-in1001
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93381 and previous config saved to /var/cache/conftool/dbconfig/20260528-150653-fceratto.json
* 14:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93380 and previous config saved to /var/cache/conftool/dbconfig/20260528-145646-fceratto.json
* 14:56 moritzm: installing nginx security updates
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93379 and previous config saved to /var/cache/conftool/dbconfig/20260528-144936-fceratto.json
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: Maintenance
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93378 and previous config saved to /var/cache/conftool/dbconfig/20260528-144909-fceratto.json
* 14:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2005.wikimedia.org
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2005.wikimedia.org with OS trixie
* 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:39 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93377 and previous config saved to /var/cache/conftool/dbconfig/20260528-143901-fceratto.json
* 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93376 and previous config saved to /var/cache/conftool/dbconfig/20260528-142854-fceratto.json
* 14:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] (duration: 11m 29s)
* 14:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93375 and previous config saved to /var/cache/conftool/dbconfig/20260528-141846-fceratto.json
* 14:15 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93374 and previous config saved to /var/cache/conftool/dbconfig/20260528-141029-fceratto.json
* 14:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1232.eqiad.wmnet with reason: Maintenance
* 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2005.wikimedia.org with OS trixie
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93373 and previous config saved to /var/cache/conftool/dbconfig/20260528-141001-fceratto.json
* 14:09 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]]
* 14:00 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93371 and previous config saved to /var/cache/conftool/dbconfig/20260528-135951-fceratto.json
* 13:58 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6015.drmrs.wmnet,service=(cdn{{!}}ats-be)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93370 and previous config saved to /var/cache/conftool/dbconfig/20260528-134944-fceratto.json
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93369 and previous config saved to /var/cache/conftool/dbconfig/20260528-133936-fceratto.json
* 13:39 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:38 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:36 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] (duration: 06m 40s)
* 13:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93368 and previous config saved to /var/cache/conftool/dbconfig/20260528-133230-fceratto.json
* 13:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93367 and previous config saved to /var/cache/conftool/dbconfig/20260528-133202-fceratto.json
* 13:31 mlitn@deploy1003: mlitn: Continuing with deployment
* 13:31 mlitn@deploy1003: mlitn: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:29 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]]
* 13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93366 and previous config saved to /var/cache/conftool/dbconfig/20260528-132155-fceratto.json
* 13:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:17 elukey: clean up a lot of stale Kafka ACLs on Kafka Jumbo - Details in [[phab:T425528|T425528]]
* 13:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2005.wikimedia.org
* 13:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93365 and previous config saved to /var/cache/conftool/dbconfig/20260528-131147-fceratto.json
* 13:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93364 and previous config saved to /var/cache/conftool/dbconfig/20260528-130139-fceratto.json
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93363 and previous config saved to /var/cache/conftool/dbconfig/20260528-125439-fceratto.json
* 12:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93362 and previous config saved to /var/cache/conftool/dbconfig/20260528-125412-fceratto.json
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93361 and previous config saved to /var/cache/conftool/dbconfig/20260528-124404-fceratto.json
* 12:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:43 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:39 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:38 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93360 and previous config saved to /var/cache/conftool/dbconfig/20260528-123357-fceratto.json
* 12:25 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93359 and previous config saved to /var/cache/conftool/dbconfig/20260528-122349-fceratto.json
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93358 and previous config saved to /var/cache/conftool/dbconfig/20260528-121551-fceratto.json
* 12:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance
* 12:15 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93357 and previous config saved to /var/cache/conftool/dbconfig/20260528-121523-fceratto.json
* 12:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93356 and previous config saved to /var/cache/conftool/dbconfig/20260528-120515-fceratto.json
* 12:02 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 12:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93355 and previous config saved to /var/cache/conftool/dbconfig/20260528-115508-fceratto.json
* 11:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93354 and previous config saved to /var/cache/conftool/dbconfig/20260528-114500-fceratto.json
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93353 and previous config saved to /var/cache/conftool/dbconfig/20260528-113635-fceratto.json
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93352 and previous config saved to /var/cache/conftool/dbconfig/20260528-113559-fceratto.json
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93351 and previous config saved to /var/cache/conftool/dbconfig/20260528-112551-fceratto.json
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93350 and previous config saved to /var/cache/conftool/dbconfig/20260528-111543-fceratto.json
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93349 and previous config saved to /var/cache/conftool/dbconfig/20260528-110536-fceratto.json
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93348 and previous config saved to /var/cache/conftool/dbconfig/20260528-105820-fceratto.json
* 10:58 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1195.eqiad.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93347 and previous config saved to /var/cache/conftool/dbconfig/20260528-105753-fceratto.json
* 10:56 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 10:50 moritzm: update trixie netboot image for 13.5 point release [[phab:T427072|T427072]]
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93346 and previous config saved to /var/cache/conftool/dbconfig/20260528-104745-fceratto.json
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93345 and previous config saved to /var/cache/conftool/dbconfig/20260528-103738-fceratto.json
* 10:29 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P13724 # [[phab:T406971|T406971]]
* 10:28 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P14223 # [[phab:T422264|T422264]]
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93344 and previous config saved to /var/cache/conftool/dbconfig/20260528-102730-fceratto.json
* 10:26 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P1748 # [[phab:T422392|T422392]]
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93343 and previous config saved to /var/cache/conftool/dbconfig/20260528-101900-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93342 and previous config saved to /var/cache/conftool/dbconfig/20260528-101829-fceratto.json
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93341 and previous config saved to /var/cache/conftool/dbconfig/20260528-100822-fceratto.json
* 09:59 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] (duration: 06m 41s)
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93340 and previous config saved to /var/cache/conftool/dbconfig/20260528-095814-fceratto.json
* 09:55 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 09:54 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:52 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]]
* 09:48 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] (duration: 07m 37s)
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93339 and previous config saved to /var/cache/conftool/dbconfig/20260528-094807-fceratto.json
* 09:44 dreamyjazz@deploy1003: dreamyjazz, stran: Continuing with deployment
* 09:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:42 dreamyjazz@deploy1003: dreamyjazz, stran: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]]
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93338 and previous config saved to /var/cache/conftool/dbconfig/20260528-093920-fceratto.json
* 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93337 and previous config saved to /var/cache/conftool/dbconfig/20260528-093849-fceratto.json
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93336 and previous config saved to /var/cache/conftool/dbconfig/20260528-092842-fceratto.json
* 09:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
* 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93335 and previous config saved to /var/cache/conftool/dbconfig/20260528-092239-fceratto.json
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki-root1001.eqiad.wmnet
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93334 and previous config saved to /var/cache/conftool/dbconfig/20260528-091834-fceratto.json
* 09:18 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1165: Reboot completed
* 09:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:17 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 09:14 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:13 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki-root1001.eqiad.wmnet
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93332 and previous config saved to /var/cache/conftool/dbconfig/20260528-091231-fceratto.json
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93331 and previous config saved to /var/cache/conftool/dbconfig/20260528-090826-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93329 and previous config saved to /var/cache/conftool/dbconfig/20260528-090224-fceratto.json
* 09:02 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod (duration: 02m 31s)
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93328 and previous config saved to /var/cache/conftool/dbconfig/20260528-090114-fceratto.json
* 09:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2216.codfw.wmnet with reason: Maintenance
* 09:00 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a] (duration: 02m 08s)
* 08:59 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod
* 08:58 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a]
* 08:57 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host (duration: 00m 53s)
* 08:56 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host
* 08:56 joal@deploy1003: Finished deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a] (duration: 06m 54s)
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93327 and previous config saved to /var/cache/conftool/dbconfig/20260528-085216-fceratto.json
* 08:50 XioNoX: cr1-codfw# delete protocols bgp group fundraising family inet6 - [[phab:T423384|T423384]]
* 08:49 joal@deploy1003: Started deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a]
* 08:49 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] (duration: 09m 20s)
* 08:49 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a] (duration: 02m 00s)
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93326 and previous config saved to /var/cache/conftool/dbconfig/20260528-084906-fceratto.json
* 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
* 08:48 slyngshede@dns1004: END - running authdns-update
* 08:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1165: Reboot completed
* 08:47 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a]
* 08:47 slyngs: Upgrade IDP to CAS 7.3.7.1
* 08:46 slyngshede@dns1004: START - running authdns-update
* 08:45 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93324 and previous config saved to /var/cache/conftool/dbconfig/20260528-084149-fceratto.json
* 08:41 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]]
* 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 08:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93323 and previous config saved to /var/cache/conftool/dbconfig/20260528-083504-fceratto.json
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93322 and previous config saved to /var/cache/conftool/dbconfig/20260528-083331-fceratto.json
* 08:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Test
* 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93320 and previous config saved to /var/cache/conftool/dbconfig/20260528-082324-fceratto.json
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2189: repool after crash
* 08:17 slyngshede@dns1004: END - running authdns-update
* 08:16 slyngshede@dns1004: START - running authdns-update
* 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93318 and previous config saved to /var/cache/conftool/dbconfig/20260528-081316-fceratto.json
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:09 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Test
* 08:05 hashar@deploy1003: Finished deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016 (duration: 00m 13s)
* 08:05 hashar@deploy1003: Started deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93315 and previous config saved to /var/cache/conftool/dbconfig/20260528-080309-fceratto.json
* 07:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93314 and previous config saved to /var/cache/conftool/dbconfig/20260528-075631-fceratto.json
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020,1022-1023].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
* 07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93313 and previous config saved to /var/cache/conftool/dbconfig/20260528-075521-fceratto.json
* 07:47 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93311 and previous config saved to /var/cache/conftool/dbconfig/20260528-074513-fceratto.json
* 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2189: repool after crash
* 07:36 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93309 and previous config saved to /var/cache/conftool/dbconfig/20260528-073506-fceratto.json
* 07:34 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:29 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] (duration: 06m 29s)
* 07:25 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Continuing with deployment
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93308 and previous config saved to /var/cache/conftool/dbconfig/20260528-072458-fceratto.json
* 07:24 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:24 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:23 tgr@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwikisource --logwiki=metawiki Ioed Renamed_user_4232d41570b9e8f46ef150e5e360e446 # [[phab:T427459|T427459]]
* 07:22 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]]
* 07:20 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] (duration: 06m 54s)
* 07:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93307 and previous config saved to /var/cache/conftool/dbconfig/20260528-071836-fceratto.json
* 07:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 07:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Reboot completed
* 07:16 wmde-fisch@deploy1003: wmde-fisch, robertsky: Continuing with deployment
* 07:15 wmde-fisch@deploy1003: wmde-fisch, robertsky: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]]
* 07:11 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] (duration: 07m 15s)
* 07:07 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Continuing with deployment
* 07:06 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:04 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]]
* 06:43 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Reboot completed
* 06:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93303 and previous config saved to /var/cache/conftool/dbconfig/20260528-064217-fceratto.json
* 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93302 and previous config saved to /var/cache/conftool/dbconfig/20260528-063357-fceratto.json
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 06:25 hashar: Restarting CI Jenkins for plugins upgrades
* 06:16 fceratto@dns1005: END - running authdns-update
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1209 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93301 and previous config saved to /var/cache/conftool/dbconfig/20260528-061609-fceratto.json
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1193 to s8 primary and set section read-write [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93300 and previous config saved to /var/cache/conftool/dbconfig/20260528-061138-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s8 eqiad as read-only for maintenance - [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93299 and previous config saved to /var/cache/conftool/dbconfig/20260528-061048-fceratto.json
* 06:10 federico3: Starting s8 eqiad failover from db1209 to db1193 - [[phab:T426095|T426095]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1193 with weight 0 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93298 and previous config saved to /var/cache/conftool/dbconfig/20260528-060412-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 [[phab:T426095|T426095]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:53 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:49 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:25 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] (duration: 07m 12s)
* 00:21 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]]
* 00:12 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] (duration: 07m 25s)
* 00:09 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:06 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:04 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]]
* 00:04 swfrench-wmf: reprepro include php-apcu_5.1.24-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
== 2026-05-27 ==
* 23:13 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] (duration: 08m 42s)
* 23:09 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Continuing with deployment
* 23:06 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:04 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]]
* 22:58 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] (duration: 07m 49s)
* 22:55 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.sanitarium_restart (exit_code=0)
* 22:54 catrope@deploy1003: catrope: Continuing with deployment
* 22:52 catrope@deploy1003: catrope: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:50 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]]
* 22:46 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] (duration: 06m 54s)
* 22:42 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:41 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:40 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitarium_restart (exit_code=99)
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]]
* 22:39 ladsgroup@deploy1003: Finished scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) (duration: 07m 16s)
* 22:35 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:34 ladsgroup@deploy1003: ladsgroup: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:33 ladsgroup@deploy1003: Started scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]])
* 22:13 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] (duration: 10m 00s)
* 22:09 egardner@deploy1003: egardner: Continuing with deployment
* 22:05 egardner@deploy1003: egardner: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:03 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]]
* 21:37 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15 days, 0:00:00 on relforge[1008-1010].eqiad.wmnet with reason: non-production environment
* 21:20 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 21:19 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 21:04 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] (duration: 07m 38s)
* 20:59 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Continuing with deployment
* 20:58 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:56 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]]
* 20:51 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 30s)
* 20:47 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:46 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:44 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]]
* 20:43 swfrench-wmf: reprepro include dh-php_5.5+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:39 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts lvs1016.eqiad.wmnet
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:38 swfrench-wmf: reprepro include php-defaults_94+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:37 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:31 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:27 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf12u2 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:25 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs1016.eqiad.wmnet
* 20:25 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] (duration: 08m 11s)
* 20:21 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1016.eqiad.wmnet with OS bullseye
* 20:21 sbisson@deploy1003: sbisson: Continuing with deployment
* 20:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1020.eqiad.wmnet
* 20:19 sbisson@deploy1003: sbisson: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]]
* 20:14 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12355
* 20:04 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 12355
* 19:51 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1016.eqiad.wmnet with OS bullseye
* 19:48 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5032.eqsin.wmnet
* 19:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 19:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 19:01 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f] (duration: 02m 08s)
* 18:59 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f]
* 18:58 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 05m 01s)
* 18:53 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:53 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] (duration: 07m 41s)
* 18:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5031.eqsin.wmnet
* 18:49 catrope@deploy1003: catrope: Continuing with deployment
* 18:47 catrope@deploy1003: catrope: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:45 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]]
* 18:40 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 01m 05s)
* 18:39 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:37 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f] (duration: 02m 04s)
* 18:35 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f]
* 18:29 swfrench@deploy1003: Finished scap sync-world: Helmfile-only deployment to clean up unused mesh listeners (duration: 06m 12s)
* 18:25 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:24 swfrench@deploy1003: swfrench: Helmfile-only deployment to clean up unused mesh listeners synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:23 swfrench@deploy1003: Started scap sync-world: Helmfile-only deployment to clean up unused mesh listeners
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93296 and previous config saved to /var/cache/conftool/dbconfig/20260527-181923-fceratto.json
* 18:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93295 and previous config saved to /var/cache/conftool/dbconfig/20260527-180915-fceratto.json
* 18:09 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 18:09 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] (duration: 10m 24s)
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1017.eqiad.wmnet
* 18:08 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1017.eqiad.wmnet
* 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5024.eqsin.wmnet
* 18:03 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:00 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:00 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:00 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93294 and previous config saved to /var/cache/conftool/dbconfig/20260527-175908-fceratto.json
* 17:58 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]]
* 17:55 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93293 and previous config saved to /var/cache/conftool/dbconfig/20260527-174900-fceratto.json
* 17:43 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] (duration: 15m 01s)
* 17:38 swfrench@deploy1003: swfrench: Continuing with deployment
* 17:31 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:28 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]]
* 17:25 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1114.eqiad.wmnet
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:05 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] (duration: 08m 44s)
* 17:00 swfrench@deploy1003: swfrench: Continuing with deployment
* 16:58 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:56 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]]
* 16:53 atsuko@dns1004: END - running authdns-update
* 16:51 atsuko@dns1004: START - running authdns-update
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93292 and previous config saved to /var/cache/conftool/dbconfig/20260527-164846-fceratto.json
* 16:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93291 and previous config saved to /var/cache/conftool/dbconfig/20260527-164815-fceratto.json
* 16:43 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1112.eqiad.wmnet
* 16:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: Setting up
* 16:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93290 and previous config saved to /var/cache/conftool/dbconfig/20260527-163808-fceratto.json
* 16:37 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Repooling after testing patch
* 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93287 and previous config saved to /var/cache/conftool/dbconfig/20260527-162800-fceratto.json
* 16:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93285 and previous config saved to /var/cache/conftool/dbconfig/20260527-161753-fceratto.json
* 16:14 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 16:12 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 16:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93284 and previous config saved to /var/cache/conftool/dbconfig/20260527-161101-fceratto.json
* 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
* 16:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93283 and previous config saved to /var/cache/conftool/dbconfig/20260527-161034-fceratto.json
* 16:10 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 16:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1178: Recovering from failure in cookbook
* 16:10 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 16:05 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host durum5003.eqsin.wmnet with OS trixie
* 16:03 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6016.drmrs.wmnet
* 16:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93280 and previous config saved to /var/cache/conftool/dbconfig/20260527-160027-fceratto.json
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1017.eqiad.wmnet
* 15:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2163.codfw.wmnet
* 15:53 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2163.codfw.wmnet
* 15:52 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1017.eqiad.wmnet
* 15:52 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Repooling after testing patch
* 15:52 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 15:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Testing cookbook
* 15:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Testing cookbook
* 15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93276 and previous config saved to /var/cache/conftool/dbconfig/20260527-155019-fceratto.json
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93274 and previous config saved to /var/cache/conftool/dbconfig/20260527-154011-fceratto.json
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1178: Recovering from failure in cookbook
* 15:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1178.eqiad.wmnet
* 15:22 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1178.eqiad.wmnet
* 15:19 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 15:19 cdanis: 💙cdanis@cp4047.ulsfo.wmnet ~ 🕦☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:13 cdanis: 💙cdanis@cp5026.eqsin.wmnet ~ 🕚☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Icinga wait failed during run
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕚☕ sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93268 and previous config saved to /var/cache/conftool/dbconfig/20260527-150508-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1220.eqiad.wmnet with reason: Maintenance
* 15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93267 and previous config saved to /var/cache/conftool/dbconfig/20260527-150438-fceratto.json
* 14:59 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 14:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93264 and previous config saved to /var/cache/conftool/dbconfig/20260527-145430-fceratto.json
* 14:54 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2163.codfw.wmnet with OS trixie
* 14:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:46 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] (duration: 08m 32s)
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1178.eqiad.wmnet with OS trixie
* 14:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93263 and previous config saved to /var/cache/conftool/dbconfig/20260527-144423-fceratto.json
* 14:42 aude@deploy1003: aude: Continuing with deployment
* 14:40 aude@deploy1003: aude: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2189.codfw.wmnet with reason: crashed [[phab:T427376|T427376]]
* 14:38 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]]
* 14:35 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] (duration: 11m 30s)
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93262 and previous config saved to /var/cache/conftool/dbconfig/20260527-143416-fceratto.json
* 14:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 aude@deploy1003: aude: Continuing with deployment
* 14:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:27 aude@deploy1003: aude: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93260 and previous config saved to /var/cache/conftool/dbconfig/20260527-142659-fceratto.json
* 14:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:23 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]]
* 14:22 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1033.eqiad.wmnet with reason: Maintenance
* 14:18 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] (duration: 33m 01s)
* 14:10 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2163.codfw.wmnet with OS trixie
* 14:09 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS trixie
* 14:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1178: Upgrading db1178.eqiad.wmnet
* 14:07 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1178: Upgrading db1178.eqiad.wmnet
* 14:06 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 stran@deploy1003: stran: Continuing with deployment
* 14:02 stran@deploy1003: stran: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:56 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2164: Migration of db2164.codfw.wmnet completed
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:45 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]]
* 13:40 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] (duration: 11m 35s)
* 13:36 phuedx@deploy1003: phuedx: Continuing with deployment
* 13:30 phuedx@deploy1003: phuedx: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]]
* 13:21 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] (duration: 13m 23s)
* 13:15 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2189: Test
* 13:15 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 13:15 mlitn@deploy1003: krinkle, mlitn: Continuing with deployment
* 13:13 mlitn@deploy1003: krinkle, mlitn: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2164: Migration of db2164.codfw.wmnet completed
* 13:08 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]]
* 13:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2212.codfw.wmnet with reason: failed to reboot [[phab:T427388|T427388]] [[phab:T426633|T426633]]
* 13:05 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2164.codfw.wmnet with OS trixie
* 12:57 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1192.eqiad.wmnet with OS trixie
* 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:35 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:28 Amir1: deleting binlogs older than a year
* 12:22 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2164.codfw.wmnet with OS trixie
* 12:21 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 36692
* 12:21 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1192.eqiad.wmnet with OS trixie
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1077
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
* 12:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2164: Upgrading db2164.codfw.wmnet
* 12:20 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 36692
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1079
* 12:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2164: Upgrading db2164.codfw.wmnet
* 12:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1192: Upgrading db1192.eqiad.wmnet
* 12:19 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1192: Upgrading db1192.eqiad.wmnet
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:15 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Migration of db2165.codfw.wmnet completed
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:12 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2189: Test
* 12:11 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1193: Migration of db1193.eqiad.wmnet completed
* 12:09 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93243 and previous config saved to /var/cache/conftool/dbconfig/20260527-120452-fceratto.json
* 12:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93242 and previous config saved to /var/cache/conftool/dbconfig/20260527-120205-fceratto.json
* 12:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 11:58 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:58 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:56 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93239 and previous config saved to /var/cache/conftool/dbconfig/20260527-115157-fceratto.json
* 11:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93237 and previous config saved to /var/cache/conftool/dbconfig/20260527-114149-fceratto.json
* 11:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93235 and previous config saved to /var/cache/conftool/dbconfig/20260527-113142-fceratto.json
* 11:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Migration of db2165.codfw.wmnet completed
* 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1193: Migration of db1193.eqiad.wmnet completed
* 11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93231 and previous config saved to /var/cache/conftool/dbconfig/20260527-112327-fceratto.json
* 11:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2188.codfw.wmnet with reason: Maintenance
* 11:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93230 and previous config saved to /var/cache/conftool/dbconfig/20260527-112257-fceratto.json
* 11:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2165.codfw.wmnet with OS trixie
* 11:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1193.eqiad.wmnet with OS trixie
* 11:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93229 and previous config saved to /var/cache/conftool/dbconfig/20260527-111250-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:10 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93227 and previous config saved to /var/cache/conftool/dbconfig/20260527-110242-fceratto.json
* 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2189', diff saved to https://phabricator.wikimedia.org/P93226 and previous config saved to /var/cache/conftool/dbconfig/20260527-110016-marostegui.json
* 10:58 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:57 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 10:56 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93225 and previous config saved to /var/cache/conftool/dbconfig/20260527-105235-fceratto.json
* 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93223 and previous config saved to /var/cache/conftool/dbconfig/20260527-104518-fceratto.json
* 10:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93222 and previous config saved to /var/cache/conftool/dbconfig/20260527-104449-fceratto.json
* 10:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2165.codfw.wmnet with OS trixie
* 10:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1193.eqiad.wmnet with OS trixie
* 10:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2165: Upgrading db2165.codfw.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2165: Upgrading db2165.codfw.wmnet
* 10:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93218 and previous config saved to /var/cache/conftool/dbconfig/20260527-103441-fceratto.json
* 10:29 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:29 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93217 and previous config saved to /var/cache/conftool/dbconfig/20260527-102434-fceratto.json
* 10:22 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:21 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93215 and previous config saved to /var/cache/conftool/dbconfig/20260527-101426-fceratto.json
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1203: Migration of db1203.eqiad.wmnet completed
* 10:10 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2166: Migration of db2166.codfw.wmnet completed
* 10:08 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93212 and previous config saved to /var/cache/conftool/dbconfig/20260527-100701-fceratto.json
* 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 10:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93211 and previous config saved to /var/cache/conftool/dbconfig/20260527-100632-fceratto.json
* 10:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 10:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1050.eqiad.wmnet with OS trixie
* 09:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93208 and previous config saved to /var/cache/conftool/dbconfig/20260527-095624-fceratto.json
* 09:47 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93206 and previous config saved to /var/cache/conftool/dbconfig/20260527-094616-fceratto.json
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:43 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:38 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 09:38 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 09:37 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 09:37 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 09:36 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93203 and previous config saved to /var/cache/conftool/dbconfig/20260527-093609-fceratto.json
* 09:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93202 and previous config saved to /var/cache/conftool/dbconfig/20260527-092842-fceratto.json
* 09:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 09:28 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1203: Migration of db1203.eqiad.wmnet completed
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93200 and previous config saved to /var/cache/conftool/dbconfig/20260527-092814-fceratto.json
* 09:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1050.eqiad.wmnet with OS trixie
* 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 09:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2166: Migration of db2166.codfw.wmnet completed
* 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2051: repool after maintenance
* 09:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1203.eqiad.wmnet with OS trixie
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93196 and previous config saved to /var/cache/conftool/dbconfig/20260527-091806-fceratto.json
* 09:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2166.codfw.wmnet with OS trixie
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93194 and previous config saved to /var/cache/conftool/dbconfig/20260527-090759-fceratto.json
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3074.*
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3066.*
* 09:03 fabfur: repooling cp3074 and cp3066 ([[phab:T419825|T419825]])
* 09:02 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6015.drmrs.wmnet
* 09:02 slyngshede@cumin1003: START - Cookbook sre.hosts.remove-downtime for cp6015.drmrs.wmnet
* 09:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 09:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp6015.*
* 08:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93193 and previous config saved to /var/cache/conftool/dbconfig/20260527-085751-fceratto.json
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 08:54 Emperor: restart swift on ms-fe2011 [[phab:T360913|T360913]]
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3066.*
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3074.*
* 08:51 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:50 fabfur: depooling and installing haproxy-awslc on cp3074 and cp3066 ([[phab:T419825|T419825]])
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93191 and previous config saved to /var/cache/conftool/dbconfig/20260527-085024-fceratto.json
* 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93190 and previous config saved to /var/cache/conftool/dbconfig/20260527-085005-fceratto.json
* 08:41 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1203.eqiad.wmnet with OS trixie
* 08:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93189 and previous config saved to /var/cache/conftool/dbconfig/20260527-083957-fceratto.json
* 08:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2051: repool after maintenance
* 08:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 08:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:35 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2166.codfw.wmnet with OS trixie
* 08:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2051.codfw.wmnet with OS trixie
* 08:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93185 and previous config saved to /var/cache/conftool/dbconfig/20260527-082950-fceratto.json
* 08:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93184 and previous config saved to /var/cache/conftool/dbconfig/20260527-081942-fceratto.json
* 08:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:11 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93183 and previous config saved to /var/cache/conftool/dbconfig/20260527-081112-fceratto.json
* 08:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93182 and previous config saved to /var/cache/conftool/dbconfig/20260527-081054-fceratto.json
* 08:07 jmm@dns1004: END - running authdns-update
* 08:05 jmm@dns1004: START - running authdns-update
* 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93181 and previous config saved to /var/cache/conftool/dbconfig/20260527-080046-fceratto.json
* 07:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2051.codfw.wmnet with OS trixie
* 07:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93180 and previous config saved to /var/cache/conftool/dbconfig/20260527-075039-fceratto.json
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1026.eqiad.wmnet
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2051: Upgrading es2051.codfw.wmnet
* 07:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2051: Upgrading es2051.codfw.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93178 and previous config saved to /var/cache/conftool/dbconfig/20260527-074031-fceratto.json
* 07:40 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] (duration: 06m 42s)
* 07:36 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 07:35 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93177 and previous config saved to /var/cache/conftool/dbconfig/20260527-073504-fceratto.json
* 07:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2248.codfw.wmnet with reason: Maintenance
* 07:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93176 and previous config saved to /var/cache/conftool/dbconfig/20260527-073434-fceratto.json
* 07:33 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]]
* 07:28 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93175 and previous config saved to /var/cache/conftool/dbconfig/20260527-072426-fceratto.json
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.decommission (exit_code=0)
* 07:23 marostegui@cumin1003: Removing pc1014 from zarcillo [[phab:T427190|T427190]]
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1014.eqiad.wmnet
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:23 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:18 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1026.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1025.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93174 and previous config saved to /var/cache/conftool/dbconfig/20260527-071418-fceratto.json
* 07:13 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1014.eqiad.wmnet
* 07:13 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2003.wikimedia.org
* 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2055: repool after maintenance
* 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2003.wikimedia.org
* 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1003.wikimedia.org
* 07:06 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: Maintenance on db1190
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93172 and previous config saved to /var/cache/conftool/dbconfig/20260527-070410-fceratto.json
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1003.wikimedia.org
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93171 and previous config saved to /var/cache/conftool/dbconfig/20260527-065545-fceratto.json
* 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93170 and previous config saved to /var/cache/conftool/dbconfig/20260527-065526-fceratto.json
* 06:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1025.eqiad.wmnet
* 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93168 and previous config saved to /var/cache/conftool/dbconfig/20260527-064519-fceratto.json
* 06:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93166 and previous config saved to /var/cache/conftool/dbconfig/20260527-063511-fceratto.json
* 06:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93165 and previous config saved to /var/cache/conftool/dbconfig/20260527-062503-fceratto.json
* 06:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2055: repool after maintenance
* 06:21 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2055.codfw.wmnet with OS trixie
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93163 and previous config saved to /var/cache/conftool/dbconfig/20260527-061643-fceratto.json
* 06:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93162 and previous config saved to /var/cache/conftool/dbconfig/20260527-061613-fceratto.json
* 06:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93161 and previous config saved to /var/cache/conftool/dbconfig/20260527-060606-fceratto.json
* 06:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:56 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93160 and previous config saved to /var/cache/conftool/dbconfig/20260527-055558-fceratto.json
* 05:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93159 and previous config saved to /var/cache/conftool/dbconfig/20260527-054550-fceratto.json
* 05:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2055.codfw.wmnet with OS trixie
* 05:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:38 moritzm: remove ganeti1026 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93157 and previous config saved to /var/cache/conftool/dbconfig/20260527-053727-fceratto.json
* 05:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93156 and previous config saved to /var/cache/conftool/dbconfig/20260527-053708-fceratto.json
* 05:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93155 and previous config saved to /var/cache/conftool/dbconfig/20260527-052700-fceratto.json
* 05:26 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1014 from dbctl [[phab:T427270|T427270]]', diff saved to https://phabricator.wikimedia.org/P93154 and previous config saved to /var/cache/conftool/dbconfig/20260527-052624-marostegui.json
* 05:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93153 and previous config saved to /var/cache/conftool/dbconfig/20260527-051653-fceratto.json
* 05:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93152 and previous config saved to /var/cache/conftool/dbconfig/20260527-050645-fceratto.json
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93151 and previous config saved to /var/cache/conftool/dbconfig/20260527-045827-fceratto.json
* 04:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93150 and previous config saved to /var/cache/conftool/dbconfig/20260527-045759-fceratto.json
* 04:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93149 and previous config saved to /var/cache/conftool/dbconfig/20260527-044751-fceratto.json
* 04:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93148 and previous config saved to /var/cache/conftool/dbconfig/20260527-043744-fceratto.json
* 04:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93147 and previous config saved to /var/cache/conftool/dbconfig/20260527-042737-fceratto.json
* 04:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93146 and previous config saved to /var/cache/conftool/dbconfig/20260527-041921-fceratto.json
* 04:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 04:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93145 and previous config saved to /var/cache/conftool/dbconfig/20260527-041852-fceratto.json
* 04:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93144 and previous config saved to /var/cache/conftool/dbconfig/20260527-040844-fceratto.json
* 03:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93143 and previous config saved to /var/cache/conftool/dbconfig/20260527-035836-fceratto.json
* 03:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93142 and previous config saved to /var/cache/conftool/dbconfig/20260527-034828-fceratto.json
* 03:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93141 and previous config saved to /var/cache/conftool/dbconfig/20260527-034008-fceratto.json
* 03:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 03:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93140 and previous config saved to /var/cache/conftool/dbconfig/20260527-033938-fceratto.json
* 03:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93139 and previous config saved to /var/cache/conftool/dbconfig/20260527-032931-fceratto.json
* 03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93138 and previous config saved to /var/cache/conftool/dbconfig/20260527-031923-fceratto.json
* 03:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93137 and previous config saved to /var/cache/conftool/dbconfig/20260527-030915-fceratto.json
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93136 and previous config saved to /var/cache/conftool/dbconfig/20260527-030045-fceratto.json
* 03:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: Maintenance
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93135 and previous config saved to /var/cache/conftool/dbconfig/20260527-030016-fceratto.json
* 02:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93134 and previous config saved to /var/cache/conftool/dbconfig/20260527-025008-fceratto.json
* 02:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93133 and previous config saved to /var/cache/conftool/dbconfig/20260527-024000-fceratto.json
* 02:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93132 and previous config saved to /var/cache/conftool/dbconfig/20260527-022953-fceratto.json
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93131 and previous config saved to /var/cache/conftool/dbconfig/20260527-022133-fceratto.json
* 02:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93130 and previous config saved to /var/cache/conftool/dbconfig/20260527-022100-fceratto.json
* 02:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93129 and previous config saved to /var/cache/conftool/dbconfig/20260527-021053-fceratto.json
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93128 and previous config saved to /var/cache/conftool/dbconfig/20260527-020045-fceratto.json
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93127 and previous config saved to /var/cache/conftool/dbconfig/20260527-015037-fceratto.json
* 01:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93126 and previous config saved to /var/cache/conftool/dbconfig/20260527-014204-fceratto.json
* 01:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 01:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93125 and previous config saved to /var/cache/conftool/dbconfig/20260527-014134-fceratto.json
* 01:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93124 and previous config saved to /var/cache/conftool/dbconfig/20260527-013126-fceratto.json
* 01:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93123 and previous config saved to /var/cache/conftool/dbconfig/20260527-012119-fceratto.json
* 01:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93122 and previous config saved to /var/cache/conftool/dbconfig/20260527-011111-fceratto.json
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93121 and previous config saved to /var/cache/conftool/dbconfig/20260527-010234-fceratto.json
* 01:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93120 and previous config saved to /var/cache/conftool/dbconfig/20260527-010205-fceratto.json
* 00:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93119 and previous config saved to /var/cache/conftool/dbconfig/20260527-005157-fceratto.json
* 00:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93118 and previous config saved to /var/cache/conftool/dbconfig/20260527-004149-fceratto.json
* 00:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93117 and previous config saved to /var/cache/conftool/dbconfig/20260527-003141-fceratto.json
* 00:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93116 and previous config saved to /var/cache/conftool/dbconfig/20260527-002309-fceratto.json
* 00:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 00:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93115 and previous config saved to /var/cache/conftool/dbconfig/20260527-002228-fceratto.json
* 00:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93114 and previous config saved to /var/cache/conftool/dbconfig/20260527-001220-fceratto.json
* 00:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93113 and previous config saved to /var/cache/conftool/dbconfig/20260527-000209-fceratto.json
== 2026-05-26 ==
* 23:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93112 and previous config saved to /var/cache/conftool/dbconfig/20260526-235201-fceratto.json
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93111 and previous config saved to /var/cache/conftool/dbconfig/20260526-234451-fceratto.json
* 23:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93110 and previous config saved to /var/cache/conftool/dbconfig/20260526-234421-fceratto.json
* 23:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93109 and previous config saved to /var/cache/conftool/dbconfig/20260526-233414-fceratto.json
* 23:27 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 23:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93108 and previous config saved to /var/cache/conftool/dbconfig/20260526-232406-fceratto.json
* 23:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93107 and previous config saved to /var/cache/conftool/dbconfig/20260526-231358-fceratto.json
* 23:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93106 and previous config saved to /var/cache/conftool/dbconfig/20260526-230650-fceratto.json
* 23:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93105 and previous config saved to /var/cache/conftool/dbconfig/20260526-230620-fceratto.json
* 22:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93104 and previous config saved to /var/cache/conftool/dbconfig/20260526-225612-fceratto.json
* 22:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93103 and previous config saved to /var/cache/conftool/dbconfig/20260526-224604-fceratto.json
* 22:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93101 and previous config saved to /var/cache/conftool/dbconfig/20260526-223556-fceratto.json
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93100 and previous config saved to /var/cache/conftool/dbconfig/20260526-222848-fceratto.json
* 22:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93099 and previous config saved to /var/cache/conftool/dbconfig/20260526-222828-fceratto.json
* 22:23 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp6015.drmrs.wmnet
* 22:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93098 and previous config saved to /var/cache/conftool/dbconfig/20260526-221819-fceratto.json
* 22:10 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS trixie
* 22:08 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS trixie
* 22:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93097 and previous config saved to /var/cache/conftool/dbconfig/20260526-220811-fceratto.json
* 22:04 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] (duration: 09m 30s)
* 22:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 22:00 egardner@deploy1003: egardner, mfossati: Continuing with deployment
* 21:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93096 and previous config saved to /var/cache/conftool/dbconfig/20260526-215803-fceratto.json
* 21:57 egardner@deploy1003: egardner, mfossati: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:56 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:56 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1010.eqiad.wmnet with OS trixie
* 21:56 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:55 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]]
* 21:54 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 21:51 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93095 and previous config saved to /var/cache/conftool/dbconfig/20260526-215043-fceratto.json
* 21:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93094 and previous config saved to /var/cache/conftool/dbconfig/20260526-215011-fceratto.json
* 21:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:47 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1009
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1009
* 21:43 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1009
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:42 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1008
* 21:40 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1008
* 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93093 and previous config saved to /var/cache/conftool/dbconfig/20260526-214003-fceratto.json
* 21:36 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1008
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:36 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:35 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1010
* 21:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1010
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1010.eqiad.wmnet with OS trixie
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1009
* 21:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS trixie
* 21:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93092 and previous config saved to /var/cache/conftool/dbconfig/20260526-212955-fceratto.json
* 21:29 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1008
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS trixie
* 21:27 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "all.dblist - mediamoderation-continuous-scan.dblist - preinstall.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose` in tmux session - [[phab:T421688|T421688]]
* 21:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93091 and previous config saved to /var/cache/conftool/dbconfig/20260526-211948-fceratto.json
* 21:19 jhathaway: dmarc ingress test run mx-in1001
* 21:15 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw and A:cp
* 21:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2057.codfw.wmnet
* 21:14 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_codfw and A:cp
* 21:14 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2058.codfw.wmnet
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93090 and previous config saved to /var/cache/conftool/dbconfig/20260526-211238-fceratto.json
* 21:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2222.codfw.wmnet with reason: Maintenance
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93089 and previous config saved to /var/cache/conftool/dbconfig/20260526-211207-fceratto.json
* 21:06 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 21:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93088 and previous config saved to /var/cache/conftool/dbconfig/20260526-210159-fceratto.json
* 20:55 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2003.codfw.wmnet with reason: WIP
* 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93087 and previous config saved to /var/cache/conftool/dbconfig/20260526-205152-fceratto.json
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:50 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:45 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93086 and previous config saved to /var/cache/conftool/dbconfig/20260526-204143-fceratto.json
* 20:38 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2055.codfw.wmnet
* 20:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93085 and previous config saved to /var/cache/conftool/dbconfig/20260526-203430-fceratto.json
* 20:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2221.codfw.wmnet with reason: Maintenance
* 20:34 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2056.codfw.wmnet
* 20:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93084 and previous config saved to /var/cache/conftool/dbconfig/20260526-203357-fceratto.json
* 20:32 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 20:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 20:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93083 and previous config saved to /var/cache/conftool/dbconfig/20260526-202349-fceratto.json
* 20:18 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] (duration: 09m 14s)
* 20:14 alexsanford@deploy1003: alexsanford, aude: Continuing with deployment
* 20:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93082 and previous config saved to /var/cache/conftool/dbconfig/20260526-201341-fceratto.json
* 20:11 alexsanford@deploy1003: alexsanford, aude: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]]
* 20:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93081 and previous config saved to /var/cache/conftool/dbconfig/20260526-200333-fceratto.json
* 19:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2053.codfw.wmnet
* 19:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2029.codfw.wmnet with OS trixie
* 19:57 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2028.codfw.wmnet with OS trixie
* 19:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93080 and previous config saved to /var/cache/conftool/dbconfig/20260526-195632-fceratto.json
* 19:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
* 19:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93079 and previous config saved to /var/cache/conftool/dbconfig/20260526-195557-fceratto.json
* 19:55 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2054.codfw.wmnet
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93078 and previous config saved to /var/cache/conftool/dbconfig/20260526-194549-fceratto.json
* 19:45 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2014.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2013.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:38 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93077 and previous config saved to /var/cache/conftool/dbconfig/20260526-193541-fceratto.json
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:30 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93076 and previous config saved to /var/cache/conftool/dbconfig/20260526-192533-fceratto.json
* 19:24 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:21 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 19:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2051.codfw.wmnet
* 19:19 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93075 and previous config saved to /var/cache/conftool/dbconfig/20260526-191818-fceratto.json
* 19:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 19:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93074 and previous config saved to /var/cache/conftool/dbconfig/20260526-191748-fceratto.json
* 19:16 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2052.codfw.wmnet
* 19:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93073 and previous config saved to /var/cache/conftool/dbconfig/20260526-190740-fceratto.json
* 19:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 19:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93072 and previous config saved to /var/cache/conftool/dbconfig/20260526-185732-fceratto.json
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93071 and previous config saved to /var/cache/conftool/dbconfig/20260526-184724-fceratto.json
* 18:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
* 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
* 18:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb2014.codfw.wmnet with OS trixie
* 18:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2049.codfw.wmnet
* 18:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93070 and previous config saved to /var/cache/conftool/dbconfig/20260526-184009-fceratto.json
* 18:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93069 and previous config saved to /var/cache/conftool/dbconfig/20260526-183939-fceratto.json
* 18:37 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2050.codfw.wmnet
* 18:30 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93068 and previous config saved to /var/cache/conftool/dbconfig/20260526-182931-fceratto.json
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:29 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:24 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 18:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93066 and previous config saved to /var/cache/conftool/dbconfig/20260526-181923-fceratto.json
* 18:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93065 and previous config saved to /var/cache/conftool/dbconfig/20260526-180915-fceratto.json
* 18:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93064 and previous config saved to /var/cache/conftool/dbconfig/20260526-180205-fceratto.json
* 18:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 18:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93063 and previous config saved to /var/cache/conftool/dbconfig/20260526-180132-fceratto.json
* 18:00 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2047.codfw.wmnet
* 17:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2048.codfw.wmnet
* 17:54 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93062 and previous config saved to /var/cache/conftool/dbconfig/20260526-175124-fceratto.json
* 17:42 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] (duration: 07m 25s)
* 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93060 and previous config saved to /var/cache/conftool/dbconfig/20260526-174117-fceratto.json
* 17:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ms-be2089.codfw.wmnet
* 17:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]]
* 17:33 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93059 and previous config saved to /var/cache/conftool/dbconfig/20260526-173109-fceratto.json
* 17:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:26 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:25 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93058 and previous config saved to /var/cache/conftool/dbconfig/20260526-172332-fceratto.json
* 17:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93057 and previous config saved to /var/cache/conftool/dbconfig/20260526-172303-fceratto.json
* 17:21 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet
* 17:20 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 17:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet
* 17:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:17 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 17:14 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:13 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:13 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93056 and previous config saved to /var/cache/conftool/dbconfig/20260526-171255-fceratto.json
* 17:11 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93055 and previous config saved to /var/cache/conftool/dbconfig/20260526-170247-fceratto.json
* 17:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:57 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93054 and previous config saved to /var/cache/conftool/dbconfig/20260526-165240-fceratto.json
* 16:50 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93053 and previous config saved to /var/cache/conftool/dbconfig/20260526-164421-fceratto.json
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
* 16:44 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93052 and previous config saved to /var/cache/conftool/dbconfig/20260526-164352-fceratto.json
* 16:42 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2043.codfw.wmnet
* 16:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2044.codfw.wmnet
* 16:40 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:40 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:40 brett: reboot lvs101[345].eqiad.wmnet
* 16:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:36 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:35 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_codfw and A:cp
* 16:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93051 and previous config saved to /var/cache/conftool/dbconfig/20260526-163344-fceratto.json
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_codfw and A:cp
* 16:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:31 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93050 and previous config saved to /var/cache/conftool/dbconfig/20260526-162336-fceratto.json
* 16:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93049 and previous config saved to /var/cache/conftool/dbconfig/20260526-161328-fceratto.json
* 16:11 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:11 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=eqiad
* 16:06 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93047 and previous config saved to /var/cache/conftool/dbconfig/20260526-160450-fceratto.json
* 16:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93046 and previous config saved to /var/cache/conftool/dbconfig/20260526-160420-fceratto.json
* 16:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:03 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 28s)
* 16:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 16:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 22s)
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93045 and previous config saved to /var/cache/conftool/dbconfig/20260526-155413-fceratto.json
* 15:46 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=eqiad
* 15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93044 and previous config saved to /var/cache/conftool/dbconfig/20260526-154405-fceratto.json
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93043 and previous config saved to /var/cache/conftool/dbconfig/20260526-153357-fceratto.json
* 15:30 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93042 and previous config saved to /var/cache/conftool/dbconfig/20260526-152629-fceratto.json
* 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93041 and previous config saved to /var/cache/conftool/dbconfig/20260526-152559-fceratto.json
* 15:24 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93040 and previous config saved to /var/cache/conftool/dbconfig/20260526-151552-fceratto.json
* 15:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Rack maintenance completed
* 15:10 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2196.codfw.wmnet
* 15:10 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2196.codfw.wmnet
* 15:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=codfw
* 15:06 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: Rack maintenance completed
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93037 and previous config saved to /var/cache/conftool/dbconfig/20260526-150546-fceratto.json
* 15:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: Rack maintenance completed
* 15:04 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]] (duration: 00m 39s)
* 15:03 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]]
* 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]] (duration: 00m 45s)
* 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]]
* 15:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 15:01 bjensen: uploading prometheus-memcached-exporter_0.16.0-1_amd64 on apt1002
* 15:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 15:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2223: switch maintenance
* 14:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Rack maintenance completed
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93033 and previous config saved to /var/cache/conftool/dbconfig/20260526-145538-fceratto.json
* 14:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 moritzm: remove ganeti1025 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2222: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2221: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:49 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93030 and previous config saved to /var/cache/conftool/dbconfig/20260526-144718-fceratto.json
* 14:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 14:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93029 and previous config saved to /var/cache/conftool/dbconfig/20260526-144651-fceratto.json
* 14:45 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:45 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:43 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=codfw
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2167: Migration of db2167.codfw.wmnet completed
* 14:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93026 and previous config saved to /var/cache/conftool/dbconfig/20260526-143643-fceratto.json
* 14:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1054.eqiad.wmnet with OS trixie
* 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93023 and previous config saved to /var/cache/conftool/dbconfig/20260526-142636-fceratto.json
* 14:26 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1014: Rack maintenance completed
* 14:24 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1014: Rack maintenance completed
* 14:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 14:19 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:19 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:18 jynus: restarting mediabackups@codfw after maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93021 and previous config saved to /var/cache/conftool/dbconfig/20260526-141628-fceratto.json
* 14:16 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:14 fabfur: repooled cp2043 ([[phab:T426199|T426199]])
* 14:14 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2223: switch maintenance
* 14:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:14 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 14:13 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] (duration: 06m 40s)
* 14:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:10 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:10 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2011.codfw.wmnet
* 14:10 fabfur@cumin1003: START - Cookbook sre.hosts.remove-downtime for lvs2011.codfw.wmnet
* 14:09 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:09 fabfur: restoring lvs2011 as primary ([[phab:T426199|T426199]])
* 14:08 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:08 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93017 and previous config saved to /var/cache/conftool/dbconfig/20260526-140748-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93016 and previous config saved to /var/cache/conftool/dbconfig/20260526-140718-fceratto.json
* 14:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]]
* 14:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
* 14:05 marostegui@cumin1003: Removing pc1013 from zarcillo [[phab:T427190|T427190]]
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1013.eqiad.wmnet
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:04 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:00 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 13:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93014 and previous config saved to /var/cache/conftool/dbconfig/20260526-135711-fceratto.json
* 13:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1054.eqiad.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2167: Migration of db2167.codfw.wmnet completed
* 13:53 Amir1: drop flaggedrevs tables on cawikinews ([[phab:T423577|T423577]])
* 13:49 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1013.eqiad.wmnet
* 13:49 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 13:48 Lucas_WMDE: UTC afternoon backport+config window done
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93012 and previous config saved to /var/cache/conftool/dbconfig/20260526-134703-fceratto.json
* 13:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2167.codfw.wmnet with OS trixie
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93011 and previous config saved to /var/cache/conftool/dbconfig/20260526-133656-fceratto.json
* 13:36 XioNoX: reboot lsw1-a2-codfw for software upgrade - [[phab:T426199|T426199]]
* 13:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: switch maintenance
* 13:35 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] (duration: 09m 28s)
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2221: switch maintenance
* 13:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: switch maintenance
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2196: switch maintenance
* 13:31 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:30 stran@deploy1003: stran: Continuing with deployment
* 13:29 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93006 and previous config saved to /var/cache/conftool/dbconfig/20260526-132927-fceratto.json
* 13:29 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: Maintenance
* 13:29 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 34 hosts with reason: Switch maintenance
* 13:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93005 and previous config saved to /var/cache/conftool/dbconfig/20260526-132857-fceratto.json
* 13:28 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lsw1-a2-codfw,lsw1-a2-codfw IPv6,lsw1-a2-codfw.mgmt with reason: Switch maintenance
* 13:27 stran@deploy1003: stran: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]]
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] (duration: 08m 30s)
* 13:22 ladsgroup@dns1004: END - running authdns-update
* 13:20 ladsgroup@dns1004: START - running authdns-update
* 13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93004 and previous config saved to /var/cache/conftool/dbconfig/20260526-131850-fceratto.json
* 13:18 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Continuing with deployment
* 13:16 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]]
* 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] (duration: 07m 09s)
* 13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93003 and previous config saved to /var/cache/conftool/dbconfig/20260526-130842-fceratto.json
* 13:08 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:07 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2167.codfw.wmnet with OS trixie
* 13:07 sbisson@deploy1003: sbisson: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2167: Upgrading db2167.codfw.wmnet
* 13:05 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]]
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2167: Upgrading db2167.codfw.wmnet
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:04 kart_: Update Recommendation API to 2026-05-26-074931-production
* 13:03 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main'.
* 13:00 topranks: deactivate CR BGP to doh2002 to test backup path via doh2001
* 12:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93000 and previous config saved to /var/cache/conftool/dbconfig/20260526-125834-fceratto.json
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92999 and previous config saved to /var/cache/conftool/dbconfig/20260526-125135-fceratto.json
* 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2226.codfw.wmnet with reason: Maintenance
* 12:51 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main'.
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92998 and previous config saved to /var/cache/conftool/dbconfig/20260526-125105-fceratto.json
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92997 and previous config saved to /var/cache/conftool/dbconfig/20260526-124059-fceratto.json
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2003.wikimedia.org
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1214: Migration of db1214.eqiad.wmnet completed
* 12:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc2003.wikimedia.org
* 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92995 and previous config saved to /var/cache/conftool/dbconfig/20260526-123052-fceratto.json
* 12:26 fabfur: depooled cp204 for network activity ([[phab:T426199|T426199]])
* 12:26 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 12:24 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-a1-codfw,ssw1-a1-codfw IPv6,ssw1-a1-codfw.mgmt with reason: Switch maintenance
* 12:24 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mirror1001.wikimedia.org
* 12:23 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 12:23 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 12:22 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92993 and previous config saved to /var/cache/conftool/dbconfig/20260526-122044-fceratto.json
* 12:20 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 12:19 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mirror1001.wikimedia.org
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92991 and previous config saved to /var/cache/conftool/dbconfig/20260526-121336-fceratto.json
* 12:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: Maintenance
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92990 and previous config saved to /var/cache/conftool/dbconfig/20260526-121306-fceratto.json
* 12:09 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Planned downtime for rack maintenance
* 12:08 fabfur: downtime, disable puppet and stop pybal for rack maintenance ([[phab:T426199|T426199]])
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2181: Migration of db2181.codfw.wmnet completed
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92987 and previous config saved to /var/cache/conftool/dbconfig/20260526-120258-fceratto.json
* 12:01 XioNoX: start ssw1-a1-codfw network maintenance (no impact expected as the spines are redundant)
* 11:59 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] (duration: 15m 26s)
* 11:56 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup2015.codfw.wmnet,db2197.codfw.wmnet with reason: network maintenance
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1005.eqiad.wmnet
* 11:55 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
* 11:54 jynus: stopping mediabackups@codfw for maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 11:54 jmm@dns1004: END - running authdns-update
* 11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92985 and previous config saved to /var/cache/conftool/dbconfig/20260526-115251-fceratto.json
* 11:52 jmm@dns1004: START - running authdns-update
* 11:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1005.eqiad.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1214: Migration of db1214.eqiad.wmnet completed
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1004.eqiad.wmnet
* 11:47 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:46 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1004.eqiad.wmnet
* 11:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]]
* 11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92983 and previous config saved to /var/cache/conftool/dbconfig/20260526-114243-fceratto.json
* 11:42 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1214.eqiad.wmnet with OS trixie
* 11:35 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] (duration: 06m 46s)
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92981 and previous config saved to /var/cache/conftool/dbconfig/20260526-113542-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92980 and previous config saved to /var/cache/conftool/dbconfig/20260526-113521-fceratto.json
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1222: Migration of db1222.eqiad.wmnet completed
* 11:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]]
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92978 and previous config saved to /var/cache/conftool/dbconfig/20260526-112513-fceratto.json
* 11:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92977 and previous config saved to /var/cache/conftool/dbconfig/20260526-112326-marostegui.json
* 11:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2181: Migration of db2181.codfw.wmnet completed
* 11:22 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92975 and previous config saved to /var/cache/conftool/dbconfig/20260526-112215-marostegui.json
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Switchover es2042 es2041 for [[phab:T426199|T426199]]', diff saved to https://phabricator.wikimedia.org/P92974 and previous config saved to /var/cache/conftool/dbconfig/20260526-112028-fceratto.json
* 11:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92972 and previous config saved to /var/cache/conftool/dbconfig/20260526-111506-fceratto.json
* 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2181.codfw.wmnet with OS trixie
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92971 and previous config saved to /var/cache/conftool/dbconfig/20260526-110458-fceratto.json
* 11:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1214.eqiad.wmnet with OS trixie
* 11:00 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] (duration: 15m 50s)
* 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92968 and previous config saved to /var/cache/conftool/dbconfig/20260526-105755-fceratto.json
* 10:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92967 and previous config saved to /var/cache/conftool/dbconfig/20260526-105726-fceratto.json
* 10:56 jiji@deploy1003: jiji: Continuing with deployment
* 10:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:51 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92966 and previous config saved to /var/cache/conftool/dbconfig/20260526-104718-fceratto.json
* 10:46 jiji@deploy1003: jiji: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:44 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]]
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92964 and previous config saved to /var/cache/conftool/dbconfig/20260526-103711-fceratto.json
* 10:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2181.codfw.wmnet with OS trixie
* 10:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:32 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92963 and previous config saved to /var/cache/conftool/dbconfig/20260526-102703-fceratto.json
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1226: Migration of db1226.eqiad.wmnet completed
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92960 and previous config saved to /var/cache/conftool/dbconfig/20260526-101936-fceratto.json
* 10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92959 and previous config saved to /var/cache/conftool/dbconfig/20260526-101842-fceratto.json
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:15 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] (duration: 06m 42s)
* 10:09 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92957 and previous config saved to /var/cache/conftool/dbconfig/20260526-100834-fceratto.json
* 10:06 kharlan@deploy1003: kharlan: Continuing with deployment
* 10:05 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:03 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]]
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2195: Migration of db2195.codfw.wmnet completed
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2004.codfw.wmnet
* 10:01 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2004.codfw.wmnet
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
* 09:58 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92955 and previous config saved to /var/cache/conftool/dbconfig/20260526-095827-fceratto.json
* 09:58 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-eqiad@eqiad
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:55 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2003.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2003.codfw.wmnet
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1006.eqiad.wmnet
* 09:53 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1006.eqiad.wmnet
* 09:52 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-eqiad@eqiad
* 09:52 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] (duration: 08m 07s)
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2044.*
* 09:48 fabfur: repooling cp2043 and cp2044 (haproxy-awslc) ([[phab:T419825|T419825]])
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92953 and previous config saved to /var/cache/conftool/dbconfig/20260526-094819-fceratto.json
* 09:47 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:46 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1006.eqiad.wmnet
* 09:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:44 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]]
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1006.eqiad.wmnet
* 09:41 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1005.eqiad.wmnet
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1005.eqiad.wmnet
* 09:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92951 and previous config saved to /var/cache/conftool/dbconfig/20260526-094115-fceratto.json
* 09:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 09:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92950 and previous config saved to /var/cache/conftool/dbconfig/20260526-094045-fceratto.json
* 09:40 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1226: Migration of db1226.eqiad.wmnet completed
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:38 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:34 fabfur: depooling cp2044 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1005.eqiad.wmnet
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2003.codfw.wmnet
* 09:34 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2044.*
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1005.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1004.eqiad.wmnet
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1004.eqiad.wmnet
* 09:33 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 09:32 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] (duration: 06m 52s)
* 09:32 fabfur: depooling cp2043 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1226.eqiad.wmnet with OS trixie
* 09:30 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 09:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92947 and previous config saved to /var/cache/conftool/dbconfig/20260526-093031-fceratto.json
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2003.codfw.wmnet
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2002.codfw.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2002.codfw.wmnet
* 09:28 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:28 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1003.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1003.eqiad.wmnet
* 09:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]]
* 09:25 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:25 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2001.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2001.codfw.wmnet
* 09:21 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:20 fabfur: start rebooting esams liberica instances ([[phab:T426563|T426563]])
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92946 and previous config saved to /var/cache/conftool/dbconfig/20260526-092024-fceratto.json
* 09:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1003.eqiad.wmnet
* 09:16 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2195: Migration of db2195.codfw.wmnet completed
* 09:15 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1003.eqiad.wmnet
* 09:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 09:14 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] (duration: 06m 47s)
* 09:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92944 and previous config saved to /var/cache/conftool/dbconfig/20260526-091016-fceratto.json
* 09:09 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 09:09 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2195.codfw.wmnet with OS trixie
* 09:07 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]]
* 09:06 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92943 and previous config saved to /var/cache/conftool/dbconfig/20260526-090315-fceratto.json
* 09:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2224.codfw.wmnet with reason: Maintenance
* 09:03 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92942 and previous config saved to /var/cache/conftool/dbconfig/20260526-090256-fceratto.json
* 08:57 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1226.eqiad.wmnet with OS trixie
* 08:53 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 fabfur: start rebooting ulsfo liberica instances ([[phab:T426563|T426563]])
* 08:53 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] (duration: 07m 23s)
* 08:53 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1226: Upgrading db1226.eqiad.wmnet
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92941 and previous config saved to /var/cache/conftool/dbconfig/20260526-085248-fceratto.json
* 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1226: Upgrading db1226.eqiad.wmnet
* 08:50 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Migration of db1222.eqiad.wmnet completed
* 08:48 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 08:47 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:46 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]]
* 08:43 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 08:43 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] (duration: 09m 56s)
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92939 and previous config saved to /var/cache/conftool/dbconfig/20260526-084240-fceratto.json
* 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1222.eqiad.wmnet with OS trixie
* 08:40 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:40 fabfur: start rebooting eqsin liberica instances ([[phab:T426563|T426563]])
* 08:39 kartik@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 08:39 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:39 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1024.eqiad.wmnet
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:35 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:33 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]]
* 08:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92938 and previous config saved to /var/cache/conftool/dbconfig/20260526-083233-fceratto.json
* 08:30 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92937 and previous config saved to /var/cache/conftool/dbconfig/20260526-082531-fceratto.json
* 08:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
* 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92936 and previous config saved to /var/cache/conftool/dbconfig/20260526-082458-fceratto.json
* 08:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2195.codfw.wmnet with OS trixie
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2195: Upgrading db2195.codfw.wmnet
* 08:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2195: Upgrading db2195.codfw.wmnet
* 08:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:18 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92934 and previous config saved to /var/cache/conftool/dbconfig/20260526-081451-fceratto.json
* 08:13 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:12 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:10 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92932 and previous config saved to /var/cache/conftool/dbconfig/20260526-080443-fceratto.json
* 08:01 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1222.eqiad.wmnet with OS trixie
* 08:00 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1024.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1023.eqiad.wmnet
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:59 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:58 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:56 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 07:56 fabfur: start rebooting drmrs liberica instances ([[phab:T426563|T426563]])
* 07:56 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92931 and previous config saved to /var/cache/conftool/dbconfig/20260526-075435-fceratto.json
* 07:52 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1047.eqiad.wmnet
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:49 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1023.eqiad.wmnet
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92930 and previous config saved to /var/cache/conftool/dbconfig/20260526-074739-fceratto.json
* 07:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92929 and previous config saved to /var/cache/conftool/dbconfig/20260526-074710-fceratto.json
* 07:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:45 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:43 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1046.eqiad.wmnet
* 07:38 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] (duration: 12m 01s)
* 07:38 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P92928 and previous config saved to /var/cache/conftool/dbconfig/20260526-073702-fceratto.json
* 07:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:35 fabfur: start rebooting magru liberica instances ([[phab:T426563|T426563]])
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92926 and previous config saved to /var/cache/conftool/dbconfig/20260526-073459-fceratto.json
* 07:32 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
* 07:31 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1046.eqiad.wmnet
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260526-072643-fceratto.json
* 07:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
* 07:26 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]]
* 07:25 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92924 and previous config saved to /var/cache/conftool/dbconfig/20260526-072452-fceratto.json
* 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1047.eqiad.wmnet
* 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1047.eqiad.wmnet
* 07:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92923 and previous config saved to /var/cache/conftool/dbconfig/20260526-071635-fceratto.json
* 07:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
* 07:15 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1026.eqiad.wmnet
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92922 and previous config saved to /var/cache/conftool/dbconfig/20260526-071444-fceratto.json
* 07:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 07:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92921 and previous config saved to /var/cache/conftool/dbconfig/20260526-070946-fceratto.json
* 07:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92920 and previous config saved to /var/cache/conftool/dbconfig/20260526-070916-fceratto.json
* 07:09 moritzm: failover Ganeti master in eqiad to ganeti1048
* 07:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1047.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1046.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92919 and previous config saved to /var/cache/conftool/dbconfig/20260526-070436-fceratto.json
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
* 07:04 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1048.eqiad.wmnet
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92918 and previous config saved to /var/cache/conftool/dbconfig/20260526-065909-fceratto.json
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast2003.wikimedia.org
* 06:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 06:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1048.eqiad.wmnet
* 06:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
* 06:53 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1046.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1045.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 06:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92917 and previous config saved to /var/cache/conftool/dbconfig/20260526-064901-fceratto.json
* 06:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92916 and previous config saved to /var/cache/conftool/dbconfig/20260526-064833-fceratto.json
* 06:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
* 06:47 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1222: Switchover
* 06:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6003.wikimedia.org
* 06:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92914 and previous config saved to /var/cache/conftool/dbconfig/20260526-063853-fceratto.json
* 06:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6003.wikimedia.org
* 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92912 and previous config saved to /var/cache/conftool/dbconfig/20260526-063155-fceratto.json
* 06:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:28 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Switchover
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1222 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92910 and previous config saved to /var/cache/conftool/dbconfig/20260526-061656-fceratto.json
* 06:15 fceratto@dns1005: END - running authdns-update
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1162 to s2 primary and set section read-write [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92909 and previous config saved to /var/cache/conftool/dbconfig/20260526-061114-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s2 eqiad as read-only for maintenance - [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92908 and previous config saved to /var/cache/conftool/dbconfig/20260526-061021-fceratto.json
* 06:10 federico3: Starting s2 eqiad failover from db1222 to db1162 - [[phab:T425622|T425622]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1162 with weight 0 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92907 and previous config saved to /var/cache/conftool/dbconfig/20260526-060443-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T425622|T425622]]
* 06:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:02 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:00 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2024.codfw.wmnet,pc[1014,1024].eqiad.wmnet with reason: Maintenance on pc4
* 04:37 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 04:34 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.1 (duration: 02m 32s)
* 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]] (duration: 36m 24s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 20s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-25 ==
* 21:00 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:49 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 20:38 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1045.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1044.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:15 moritzm: truncate krb5kdc.log1 (which made log rotation fail)
* 20:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 19:57 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1044.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1043.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:22 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_eqiad
* 18:49 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1115.eqiad.wmnet
* 18:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5023.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:22 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:22 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5030.eqsin.wmnet
* 18:15 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5023.eqsin.wmnet
* 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1113.eqiad.wmnet
* 18:03 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:02 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqiad
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-upload_eqsin
* 18:01 sukhe: sre.cdn.roll-reboot cookbooks stalled due to icinga reboot
* 18:00 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqsin
* 17:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1043.eqiad.wmnet
* 17:31 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1110.eqiad.wmnet [reason: manually pooling after reboot as icinga was down]
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1042.eqiad.wmnet
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:29 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1111.eqiad.wmnet
* 17:28 sukhe: sukhe@alert1002:~$ sudo systemctl restart icinga.service
* 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92903 and previous config saved to /var/cache/conftool/dbconfig/20260525-171310-fceratto.json
* 17:11 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92902 and previous config saved to /var/cache/conftool/dbconfig/20260525-170302-fceratto.json
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92901 and previous config saved to /var/cache/conftool/dbconfig/20260525-165255-fceratto.json
* 16:51 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1042.eqiad.wmnet
* 16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92900 and previous config saved to /var/cache/conftool/dbconfig/20260525-164247-fceratto.json
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1041.eqiad.wmnet
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:41 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:40 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5021.eqsin.wmnet
* 16:39 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5029.eqsin.wmnet
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92899 and previous config saved to /var/cache/conftool/dbconfig/20260525-163559-fceratto.json
* 16:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92898 and previous config saved to /var/cache/conftool/dbconfig/20260525-163512-fceratto.json
* 16:34 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1108.eqiad.wmnet
* 16:30 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1109.eqiad.wmnet
* 16:26 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92897 and previous config saved to /var/cache/conftool/dbconfig/20260525-162505-fceratto.json
* 16:20 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1041.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1040.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:16 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92896 and previous config saved to /var/cache/conftool/dbconfig/20260525-161457-fceratto.json
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92895 and previous config saved to /var/cache/conftool/dbconfig/20260525-160450-fceratto.json
* 16:02 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92894 and previous config saved to /var/cache/conftool/dbconfig/20260525-155930-fceratto.json
* 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2249.codfw.wmnet with reason: Maintenance
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5020.eqsin.wmnet
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5028.eqsin.wmnet
* 15:52 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1106.eqiad.wmnet
* 15:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1107.eqiad.wmnet
* 15:29 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1040.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1039.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:17 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1013 from dbctl [[phab:T427190|T427190]]', diff saved to https://phabricator.wikimedia.org/P92893 and previous config saved to /var/cache/conftool/dbconfig/20260525-151718-marostegui.json
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5019.eqsin.wmnet
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5027.eqsin.wmnet
* 15:12 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1104.eqiad.wmnet
* 15:11 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1105.eqiad.wmnet
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92892 and previous config saved to /var/cache/conftool/dbconfig/20260525-150309-fceratto.json
* 14:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92891 and previous config saved to /var/cache/conftool/dbconfig/20260525-145301-fceratto.json
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92890 and previous config saved to /var/cache/conftool/dbconfig/20260525-144253-fceratto.json
* 14:33 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1102.eqiad.wmnet
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92889 and previous config saved to /var/cache/conftool/dbconfig/20260525-143246-fceratto.json
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5026.eqsin.wmnet
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5018.eqsin.wmnet
* 14:31 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1103.eqiad.wmnet
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92888 and previous config saved to /var/cache/conftool/dbconfig/20260525-142551-fceratto.json
* 14:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2228.codfw.wmnet with reason: Maintenance
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92887 and previous config saved to /var/cache/conftool/dbconfig/20260525-142520-fceratto.json
* 14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92885 and previous config saved to /var/cache/conftool/dbconfig/20260525-141513-fceratto.json
* 14:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:06 sukhe: curl localhost:9090/pools/inference-staging-grpc_30051 shows ml-staging200[1-3].codfw.wmnet as enabled and pooled: [[phab:T424049|T424049]]
* 14:05 sukhe: sukhe@lvs2013:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92884 and previous config saved to /var/cache/conftool/dbconfig/20260525-140505-fceratto.json
* 14:03 sukhe: sudo cumin 'A:lvs and A:lvs-low-traffic-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service
* 14:00 sukhe: sudo cumin 'A:lvs and A:lvs-secondary-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 13:59 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1039.eqiad.wmnet
* 13:58 sukhe: sudo cumin 'A:lvs and A:eqiad' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"': NOOP change, since the service is codfw only
* 13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92882 and previous config saved to /var/cache/conftool/dbconfig/20260525-135458-fceratto.json
* 13:52 Msz2001: Everything deployed, UTC afternoon config+backport window done
* 13:52 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] (duration: 09m 43s)
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1101.eqiad.wmnet
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1100.eqiad.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5025.eqsin.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:49 kart_: Updated Recommendation API to 2026-05-21-044522-production
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92881 and previous config saved to /var/cache/conftool/dbconfig/20260525-134807-fceratto.json
* 13:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2223.codfw.wmnet with reason: Maintenance
* 13:47 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:47 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92880 and previous config saved to /var/cache/conftool/dbconfig/20260525-134737-fceratto.json
* 13:45 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: Reboot
* 13:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]]
* 13:40 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqiad
* 13:39 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqiad
* 13:38 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] (duration: 08m 14s)
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqsin
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqsin
* 13:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92878 and previous config saved to /var/cache/conftool/dbconfig/20260525-133729-fceratto.json
* 13:34 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:33 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1038.eqiad.wmnet
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:31 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]]
* 13:27 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] (duration: 07m 43s)
* 13:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92876 and previous config saved to /var/cache/conftool/dbconfig/20260525-132722-fceratto.json
* 13:23 mszwarc@deploy1003: mszwarc, jhsoby: Continuing with deployment
* 13:21 mszwarc@deploy1003: mszwarc, jhsoby: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:20 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]]
* 13:19 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] (duration: 15m 53s)
* 13:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92875 and previous config saved to /var/cache/conftool/dbconfig/20260525-131714-fceratto.json
* 13:12 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92873 and previous config saved to /var/cache/conftool/dbconfig/20260525-131023-fceratto.json
* 13:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92872 and previous config saved to /var/cache/conftool/dbconfig/20260525-130950-fceratto.json
* 13:07 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]]
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1162: Reboot
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92870 and previous config saved to /var/cache/conftool/dbconfig/20260525-125942-fceratto.json
* 12:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reboot
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:58 kart_: Updated cxserver to 2026-05-24-103047-production ([[phab:T426808|T426808]], [[phab:T373418|T373418]])
* 12:56 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:56 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:54 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db1162: Reboot
* 12:54 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:54 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:53 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1162.eqiad.wmnet with reason: Reboot
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92868 and previous config saved to /var/cache/conftool/dbconfig/20260525-124934-fceratto.json
* 12:40 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:39 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:39 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1038.eqiad.wmnet
* 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92867 and previous config saved to /var/cache/conftool/dbconfig/20260525-123927-fceratto.json
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92866 and previous config saved to /var/cache/conftool/dbconfig/20260525-123239-fceratto.json
* 12:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92865 and previous config saved to /var/cache/conftool/dbconfig/20260525-123208-fceratto.json
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92864 and previous config saved to /var/cache/conftool/dbconfig/20260525-122201-fceratto.json
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1037.eqiad.wmnet
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92863 and previous config saved to /var/cache/conftool/dbconfig/20260525-121153-fceratto.json
* 12:10 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92862 and previous config saved to /var/cache/conftool/dbconfig/20260525-120145-fceratto.json
* 11:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92861 and previous config saved to /var/cache/conftool/dbconfig/20260525-115504-fceratto.json
* 11:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92860 and previous config saved to /var/cache/conftool/dbconfig/20260525-115434-fceratto.json
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92859 and previous config saved to /var/cache/conftool/dbconfig/20260525-114426-fceratto.json
* 11:43 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1037.eqiad.wmnet
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92858 and previous config saved to /var/cache/conftool/dbconfig/20260525-113419-fceratto.json
* 11:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2160.codfw.wmnet with OS trixie
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92857 and previous config saved to /var/cache/conftool/dbconfig/20260525-112411-fceratto.json
* 11:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92856 and previous config saved to /var/cache/conftool/dbconfig/20260525-111717-fceratto.json
* 11:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92855 and previous config saved to /var/cache/conftool/dbconfig/20260525-111648-fceratto.json
* 11:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92854 and previous config saved to /var/cache/conftool/dbconfig/20260525-110640-fceratto.json
* 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 10:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:56 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92853 and previous config saved to /var/cache/conftool/dbconfig/20260525-105633-fceratto.json
* 10:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92852 and previous config saved to /var/cache/conftool/dbconfig/20260525-104625-fceratto.json
* 10:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2160.codfw.wmnet with OS trixie
* 10:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92851 and previous config saved to /var/cache/conftool/dbconfig/20260525-104141-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to pc3 as master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92850 and previous config saved to /var/cache/conftool/dbconfig/20260525-104055-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to dbctl', diff saved to https://phabricator.wikimedia.org/P92849 and previous config saved to /var/cache/conftool/dbconfig/20260525-104027-marostegui.json
* 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92848 and previous config saved to /var/cache/conftool/dbconfig/20260525-103944-fceratto.json
* 10:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 10:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
* 10:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
* 10:27 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:16 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1006.eqiad.wmnet
* 09:57 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.eqiad.wmnet
* 09:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92847 and previous config saved to /var/cache/conftool/dbconfig/20260525-091302-fceratto.json
* 09:12 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92846 and previous config saved to /var/cache/conftool/dbconfig/20260525-090255-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92845 and previous config saved to /var/cache/conftool/dbconfig/20260525-085247-fceratto.json
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92844 and previous config saved to /var/cache/conftool/dbconfig/20260525-084239-fceratto.json
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92843 and previous config saved to /var/cache/conftool/dbconfig/20260525-083540-fceratto.json
* 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2231.codfw.wmnet with reason: Maintenance
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92842 and previous config saved to /var/cache/conftool/dbconfig/20260525-083511-fceratto.json
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92841 and previous config saved to /var/cache/conftool/dbconfig/20260525-082504-fceratto.json
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92840 and previous config saved to /var/cache/conftool/dbconfig/20260525-081456-fceratto.json
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92839 and previous config saved to /var/cache/conftool/dbconfig/20260525-080448-fceratto.json
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92838 and previous config saved to /var/cache/conftool/dbconfig/20260525-075739-fceratto.json
* 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: Maintenance
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92837 and previous config saved to /var/cache/conftool/dbconfig/20260525-075708-fceratto.json
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92836 and previous config saved to /var/cache/conftool/dbconfig/20260525-074700-fceratto.json
* 07:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92835 and previous config saved to /var/cache/conftool/dbconfig/20260525-073653-fceratto.json
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92834 and previous config saved to /var/cache/conftool/dbconfig/20260525-072645-fceratto.json
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92833 and previous config saved to /var/cache/conftool/dbconfig/20260525-071953-fceratto.json
* 07:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2196.codfw.wmnet with reason: Maintenance
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92832 and previous config saved to /var/cache/conftool/dbconfig/20260525-071924-fceratto.json
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92831 and previous config saved to /var/cache/conftool/dbconfig/20260525-070917-fceratto.json
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2233.codfw.wmnet with OS trixie
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92830 and previous config saved to /var/cache/conftool/dbconfig/20260525-065909-fceratto.json
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92829 and previous config saved to /var/cache/conftool/dbconfig/20260525-064902-fceratto.json
* 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92828 and previous config saved to /var/cache/conftool/dbconfig/20260525-064305-fceratto.json
* 06:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2233.codfw.wmnet with OS trixie
* 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2233.codfw.wmnet with reason: Reimage to Trixie
* 06:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:17 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboot upgrade m2
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2233.codfw.wmnet with reason: Reboot upgrade m2
* 06:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1027.eqiad.wmnet with reason: Reboot
* 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2023.codfw.wmnet,pc[1013,1023].eqiad.wmnet with reason: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1013.eqiad.wmnet: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1013.eqiad.wmnet: Maintenance on pc3
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 43s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-24 ==
* 19:08 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-23 ==
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-22 ==
* 23:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 23:38 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 22:20 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:29 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:28 inflatador: bking@deploy1003 set eqiad prod cirrus `node_concurrent_recoveries` up to 7 from 4 [[phab:T426585|T426585]]
* 20:27 inflatador: bking@deploy1003 set codfw prod cirrus `node_concurrent_recoveries` back down to 4 from 7 [[phab:T426585|T426585]]
* 18:39 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 17:34 topranks: enable ttl protection on esams CRs IBGP session
* 17:28 topranks: enable ttl protection on ulsfo CRs IBGP session
* 16:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:49 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:16 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:58 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:15 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:14 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2008-dev.codfw.wmnet
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb[1020,1022-1025].eqiad.wmnet
* 14:29 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:23 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2008-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2007-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:03 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 13:59 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb[1020,1022-1025].eqiad.wmnet
* 13:58 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 13:53 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2007-dev.codfw.wmnet
* 13:52 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1018.eqiad.wmnet
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:46 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for 6 hosts
* 13:16 inflatador: bking@deploy1002 set search_codfw cluster recovery settings from 4 to 7 [[phab:T426560|T426560]]
* 13:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for 6 hosts
* 13:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 13:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 13:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:10 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet
* 13:09 elukey: uploaded spicerack_12.6.0 to apt.wikimedia.org bookworm-wikimedia
* 13:08 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1017.eqiad.wmnet
* 12:59 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3081.esams.wmnet
* 12:54 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:41 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:15 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3080.esams.wmnet
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 12:03 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3073.esams.wmnet
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: Migration of db2154.codfw.wmnet completed
* 11:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3072.esams.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 11:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1172: Migration of db1172.eqiad.wmnet completed
* 11:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
* 11:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3079.esams.wmnet
* 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
* 10:55 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:48 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet
* 10:43 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2154: Migration of db2154.codfw.wmnet completed
* 10:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
* 10:37 moritzm: remove ganeti1024 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2154.codfw.wmnet with OS trixie
* 10:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2010.codfw.wmnet with OS trixie
* 10:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1172: Migration of db1172.eqiad.wmnet completed
* 10:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3078.esams.wmnet
* 10:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS trixie
* 10:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1017.eqiad.wmnet
* 10:13 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3071.esams.wmnet
* 09:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:56 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS trixie
* 09:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:51 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: Upgrading db2154.codfw.wmnet
* 09:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2154: Upgrading db2154.codfw.wmnet
* 09:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS trixie
* 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS trixie
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3070.esams.wmnet
* 09:21 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:16 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 09:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3077.esams.wmnet
* 09:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:03 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 08:47 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 08:40 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:30 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3076.esams.wmnet
* 08:18 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:15 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:07 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:07 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3069.esams.wmnet
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 07:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 07:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3068.esams.wmnet
* 07:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:10 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3075.esams.wmnet
* 07:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1057
* 07:01 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1057
* 06:58 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3067.esams.wmnet
* 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
* 06:46 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 06:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3007.wikimedia.org
* 06:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3007.wikimedia.org
* 05:25 marostegui@dns1004: END - running authdns-update
* 05:24 marostegui@dns1004: START - running authdns-update
* 05:23 marostegui: Failover m5-master [[phab:T426633|T426633]]
* 05:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1028.eqiad.wmnet with reason: Reboot
* 05:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy2005.codfw.wmnet with reason: Reboot
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1012.eqiad.wmnet
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:06 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:03 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 04:56 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1012.eqiad.wmnet
== 2026-05-21 ==
* 23:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] (duration: 06m 42s)
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified
* 23:36 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]]
* 22:26 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2002.codfw.wmnet with OS trixie
* 22:08 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:03 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:02 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:44 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2002.codfw.wmnet with OS trixie
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:20 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:19 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:26 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:16 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 19:22 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase
* 19:10 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:59 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:53 papaul: rebooting msw1-codfw
* 18:50 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:39 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:50 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:48 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:43 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:39 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 17:39 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:38 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 17:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 17:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:36 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:25 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:25 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:23 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:22 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1016.eqiad.wmnet
* 17:22 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:13 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1016.eqiad.wmnet
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92810 and previous config saved to /var/cache/conftool/dbconfig/20260521-170823-ladsgroup.json
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 16:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 16:55 papaul: rebooting msw-d3-codfw
* 16:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:52 papaul: rebooting msw-c7-codfw
* 16:51 papaul: rebooting msw-c6-codfw
* 16:48 papaul: rebooting msw-b7-codfw
* 16:48 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet
* 16:45 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1014.eqiad.wmnet
* 16:43 papaul: rebooting msw-b6-codfw
* 16:40 papaul: rebooting msw-a1-codfw
* 16:37 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:37 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1014.eqiad.wmnet
* 16:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:33 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:26 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc1022.eqiad.wmnet with reason: Move to nftables
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc2022.codfw.wmnet with reason: Move to nftables
* 16:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Repooling
* 16:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92807 and previous config saved to /var/cache/conftool/dbconfig/20260521-161808-ladsgroup.json
* 16:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:52 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Repooling
* 15:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92804 and previous config saved to /var/cache/conftool/dbconfig/20260521-154108-fceratto.json
* 15:39 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92803 and previous config saved to /var/cache/conftool/dbconfig/20260521-153400-fceratto.json
* 15:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2048.codfw.wmnet with reason: Maintenance
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92802 and previous config saved to /var/cache/conftool/dbconfig/20260521-153331-fceratto.json
* 15:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:25 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92801 and previous config saved to /var/cache/conftool/dbconfig/20260521-152323-fceratto.json
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet
* 15:19 claime: Enabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 15:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet
* 15:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92800 and previous config saved to /var/cache/conftool/dbconfig/20260521-151316-fceratto.json
* 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 15:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet
* 15:07 elukey@cumin1003: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:05 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:04 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] (duration: 10m 11s)
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92799 and previous config saved to /var/cache/conftool/dbconfig/20260521-150308-fceratto.json
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet
* 15:00 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master
* 15:00 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:00 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.pki.restart-reboot (exit_code=0) rolling reboot on A:pki
* 14:57 claime: Disabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 14:56 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:55 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 14:54 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-build1001.eqiad.wmnet
* 14:54 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]]
* 14:54 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 14:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1001.eqiad.wmnet
* 14:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1001.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92798 and previous config saved to /var/cache/conftool/dbconfig/20260521-145132-fceratto.json
* 14:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2040.codfw.wmnet with reason: Maintenance
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92797 and previous config saved to /var/cache/conftool/dbconfig/20260521-145103-fceratto.json
* 14:50 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-build1001.eqiad.wmnet
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Migration of db2241.codfw.wmnet completed
* 14:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1001.eqiad.wmnet
* 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
* 14:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1001.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-eqiad
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1011.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1011.eqiad.wmnet
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92795 and previous config saved to /var/cache/conftool/dbconfig/20260521-144055-fceratto.json
* 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:37 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1011.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1011.eqiad.wmnet
* 14:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 14:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1010.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1010.eqiad.wmnet
* 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92793 and previous config saved to /var/cache/conftool/dbconfig/20260521-143045-fceratto.json
* 14:30 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:30 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:29 elukey@cumin1003: START - Cookbook sre.pki.restart-reboot rolling reboot on A:pki
* 14:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
* 14:27 slyngshede@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-reboot (exit_code=1) rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 14:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet
* 14:24 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1010.eqiad.wmnet
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92792 and previous config saved to /var/cache/conftool/dbconfig/20260521-142037-fceratto.json
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:17 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1010.eqiad.wmnet
* 14:14 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1009.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1009.eqiad.wmnet
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: repool after maintenance
* 14:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92789 and previous config saved to /var/cache/conftool/dbconfig/20260521-140906-fceratto.json
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
* 14:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92788 and previous config saved to /var/cache/conftool/dbconfig/20260521-140837-fceratto.json
* 14:08 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1009.eqiad.wmnet
* 14:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet
* 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
* 14:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Migration of db2241.codfw.wmnet completed
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1009.eqiad.wmnet
* 14:03 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1008.eqiad.wmnet
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1008.eqiad.wmnet
* 14:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2241.codfw.wmnet with OS trixie
* 13:59 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
* 13:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92786 and previous config saved to /var/cache/conftool/dbconfig/20260521-135830-fceratto.json
* 13:58 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1007.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1007.eqiad.wmnet
* 13:51 Lucas_WMDE: UTC afternoon backport+config window done
* 13:51 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] (duration: 07m 20s)
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92784 and previous config saved to /var/cache/conftool/dbconfig/20260521-134822-fceratto.json
* 13:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1007.eqiad.wmnet
* 13:47 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 13:46 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
* 13:45 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes
* 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:44 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 13:43 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]]
* 13:43 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 13:43 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1007.eqiad.wmnet
* 13:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1006.eqiad.wmnet
* 13:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1006.eqiad.wmnet
* 13:41 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] (duration: 06m 52s)
* 13:41 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 13:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet
* 13:38 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
* 13:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92782 and previous config saved to /var/cache/conftool/dbconfig/20260521-133815-fceratto.json
* 13:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1006.eqiad.wmnet
* 13:37 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
* 13:37 dbrant@deploy1003: dbrant: Continuing with deployment
* 13:36 dbrant@deploy1003: dbrant: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
* 13:35 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]]
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1006.eqiad.wmnet
* 13:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1005.eqiad.wmnet
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1005.eqiad.wmnet
* 13:31 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] (duration: 09m 11s)
* 13:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92781 and previous config saved to /var/cache/conftool/dbconfig/20260521-133116-fceratto.json
* 13:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92780 and previous config saved to /var/cache/conftool/dbconfig/20260521-133048-fceratto.json
* 13:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
* 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet
* 13:27 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1005.eqiad.wmnet
* 13:27 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: repool after maintenance
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2241.codfw.wmnet with OS trixie
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:24 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:22 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]]
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1005.eqiad.wmnet
* 13:22 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1004.eqiad.wmnet
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1004.eqiad.wmnet
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92778 and previous config saved to /var/cache/conftool/dbconfig/20260521-132041-fceratto.json
* 13:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
* 13:20 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] (duration: 11m 55s)
* 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 13:17 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet
* 13:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1039: Repooling
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
* 13:15 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1004.eqiad.wmnet
* 13:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 13:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1004.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92776 and previous config saved to /var/cache/conftool/dbconfig/20260521-131033-fceratto.json
* 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1003.eqiad.wmnet
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1003.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 13:10 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2241 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92775 and previous config saved to /var/cache/conftool/dbconfig/20260521-131025-cwilliams.json
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:10 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 13:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3074.esams.wmnet
* 13:06 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2162 to x3 primary [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92774 and previous config saved to /var/cache/conftool/dbconfig/20260521-130609-cwilliams.json
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:04 cezmunsta: Starting x3 codfw failover from db2241 to db2162 - [[phab:T426936|T426936]]
* 13:04 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1003.eqiad.wmnet
* 13:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 13:00 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92772 and previous config saved to /var/cache/conftool/dbconfig/20260521-130018-fceratto.json
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1003.eqiad.wmnet
* 12:59 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:59 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1002.eqiad.wmnet
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1002.eqiad.wmnet
* 12:58 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:57 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2162 with weight 0 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92771 and previous config saved to /var/cache/conftool/dbconfig/20260521-125645-cwilliams.json
* 12:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T426936|T426936]]
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet
* 12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
* 12:54 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1002.eqiad.wmnet
* 12:54 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6008.drmrs.wmnet
* 12:53 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:52 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1002.eqiad.wmnet
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
* 12:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:48 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3066.esams.wmnet
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92770 and previous config saved to /var/cache/conftool/dbconfig/20260521-124707-fceratto.json
* 12:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
* 12:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1039: Repooling
* 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet
* 12:45 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:43 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] (duration: 07m 54s)
* 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92768 and previous config saved to /var/cache/conftool/dbconfig/20260521-124014-fceratto.json
* 12:39 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
* 12:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 12:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:36 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:36 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]]
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:34 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:34 kart_: Updated cxserver to 2026-05-20-034002-production ([[phab:T388690|T388690]], [[phab:T404295|T404295]], [[phab:T391703|T391703]], [[phab:T426605|T426605]])
* 12:34 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
* 12:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
* 12:30 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:30 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
* 12:29 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92767 and previous config saved to /var/cache/conftool/dbconfig/20260521-122905-fceratto.json
* 12:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
* 12:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92766 and previous config saved to /var/cache/conftool/dbconfig/20260521-122839-fceratto.json
* 12:27 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:27 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:26 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-staging-worker
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2003.codfw.wmnet
* 12:23 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2003.codfw.wmnet
* 12:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
* 12:21 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:21 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:21 moritzm: installing nginx security updates
* 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 12:20 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92765 and previous config saved to /var/cache/conftool/dbconfig/20260521-121832-fceratto.json
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2003.codfw.wmnet
* 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
* 12:15 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
* 12:13 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6007.drmrs.wmnet
* 12:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92764 and previous config saved to /var/cache/conftool/dbconfig/20260521-120824-fceratto.json
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2003.codfw.wmnet
* 12:07 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2002.codfw.wmnet
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2002.codfw.wmnet
* 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
* 12:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6014.drmrs.wmnet
* 12:00 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:00 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2002.codfw.wmnet
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
* 11:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92763 and previous config saved to /var/cache/conftool/dbconfig/20260521-115817-fceratto.json
* 11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
* 11:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
* 11:51 taavi: disabling puppet on C:bird to roll out {{Gerrit|1289919}}
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92762 and previous config saved to /var/cache/conftool/dbconfig/20260521-115112-fceratto.json
* 11:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2047.codfw.wmnet with reason: Maintenance
* 11:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2002.codfw.wmnet
* 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92761 and previous config saved to /var/cache/conftool/dbconfig/20260521-115043-fceratto.json
* 11:50 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2001.codfw.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2001.codfw.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt2002.wikimedia.org
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet
* 11:45 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2001.codfw.wmnet
* 11:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp1001.eqiad.wmnet
* 11:44 kartik@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet
* 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt2002.wikimedia.org
* 11:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet
* 11:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92760 and previous config saved to /var/cache/conftool/dbconfig/20260521-114036-fceratto.json
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp1001.eqiad.wmnet
* 11:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
* 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 11:36 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet
* 11:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2001.codfw.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker
* 11:35 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
* 11:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
* 11:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
* 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92759 and previous config saved to /var/cache/conftool/dbconfig/20260521-113028-fceratto.json
* 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2014.codfw.wmnet
* 11:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
* 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
* 11:26 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet
* 11:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2014.codfw.wmnet
* 11:20 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6013.drmrs.wmnet
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92758 and previous config saved to /var/cache/conftool/dbconfig/20260521-112021-fceratto.json
* 11:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
* 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2013.codfw.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
* 11:09 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92757 and previous config saved to /var/cache/conftool/dbconfig/20260521-110851-fceratto.json
* 11:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2037.codfw.wmnet with reason: Maintenance
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92756 and previous config saved to /var/cache/conftool/dbconfig/20260521-110822-fceratto.json
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
* 11:05 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2013.codfw.wmnet
* 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:04 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6006.drmrs.wmnet
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
* 11:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
* 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92753 and previous config saved to /var/cache/conftool/dbconfig/20260521-105815-fceratto.json
* 10:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet
* 10:55 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:54 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2012.codfw.wmnet
* 10:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92752 and previous config saved to /var/cache/conftool/dbconfig/20260521-104807-fceratto.json
* 10:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2012.codfw.wmnet
* 10:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet
* 10:44 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] (duration: 08m 02s)
* 10:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:40 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet
* 10:40 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:39 jiji@deploy1003: jiji: Continuing with deployment
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92751 and previous config saved to /var/cache/conftool/dbconfig/20260521-103759-fceratto.json
* 10:37 jiji@deploy1003: jiji: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:36 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]]
* 10:35 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet
* 10:34 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:29 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet
* 10:27 dcausse: [[phab:T423993|T423993]]: reindexing all archive indices
* 10:27 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92749 and previous config saved to /var/cache/conftool/dbconfig/20260521-102630-fceratto.json
* 10:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2036.codfw.wmnet with reason: Maintenance
* 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92748 and previous config saved to /var/cache/conftool/dbconfig/20260521-102601-fceratto.json
* 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2011.codfw.wmnet
* 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6005.drmrs.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet
* 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2011.codfw.wmnet
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet
* 10:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92747 and previous config saved to /var/cache/conftool/dbconfig/20260521-101552-fceratto.json
* 10:15 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:14 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet
* 10:12 moritzm: installing postgresql security updates
* 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet
* 10:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet
* 10:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon1003.wikimedia.org
* 10:09 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:08 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet
* 10:08 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet
* 10:07 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet
* 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet
* 10:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92746 and previous config saved to /var/cache/conftool/dbconfig/20260521-100545-fceratto.json
* 10:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet
* 10:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet
* 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet
* 10:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet
* 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet
* 10:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org
* 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org
* 09:59 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1005.eqiad.wmnet
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-codfw
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2005.codfw.wmnet
* 09:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2005.codfw.wmnet
* 09:56 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
* 09:56 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:56 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92745 and previous config saved to /var/cache/conftool/dbconfig/20260521-095536-fceratto.json
* 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1384.eqiad.wmnet
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2004.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2004.codfw.wmnet
* 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
* 09:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
* 09:49 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1384.eqiad.wmnet
* 09:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1383.eqiad.wmnet
* 09:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92744 and previous config saved to /var/cache/conftool/dbconfig/20260521-094829-fceratto.json
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92743 and previous config saved to /var/cache/conftool/dbconfig/20260521-094801-fceratto.json
* 09:47 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet
* 09:47 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 [[phab:T426563|T426563]]
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-eqiad
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:44 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1383.eqiad.wmnet
* 09:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1382.eqiad.wmnet
* 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
* 09:39 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet
* 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1382.eqiad.wmnet
* 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1381.eqiad.wmnet
* 09:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2002.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2002.codfw.wmnet
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92742 and previous config saved to /var/cache/conftool/dbconfig/20260521-093754-fceratto.json
* 09:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet
* 09:36 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 09:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6012.drmrs.wmnet
* 09:34 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet
* 09:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1381.eqiad.wmnet
* 09:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1380.eqiad.wmnet
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2001.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2001.codfw.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet
* 09:29 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=eqiad
* 09:29 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts.*,name=codfw
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet
* 09:28 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92741 and previous config saved to /var/cache/conftool/dbconfig/20260521-092746-fceratto.json
* 09:27 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1380.eqiad.wmnet
* 09:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1379.eqiad.wmnet
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet
* 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
* 09:25 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet
* 09:24 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=codfw
* 09:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:23 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-eqiad
* 09:22 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1379.eqiad.wmnet
* 09:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1378.eqiad.wmnet
* 09:21 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-codfw
* 09:21 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:18 moritzm: remove ganeti1023 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92740 and previous config saved to /var/cache/conftool/dbconfig/20260521-091738-fceratto.json
* 09:16 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1378.eqiad.wmnet
* 09:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1376.eqiad.wmnet
* 09:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Repooling
* 09:07 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1376.eqiad.wmnet
* 09:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1375.eqiad.wmnet
* 09:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92738 and previous config saved to /var/cache/conftool/dbconfig/20260521-090609-fceratto.json
* 09:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1037.eqiad.wmnet with reason: Maintenance
* 09:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1375.eqiad.wmnet
* 09:01 btullis@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:55 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6011.drmrs.wmnet
* 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:44 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6004.drmrs.wmnet
* 08:37 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Repooling
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92733 and previous config saved to /var/cache/conftool/dbconfig/20260521-082951-fceratto.json
* 08:29 hashar@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92731 and previous config saved to /var/cache/conftool/dbconfig/20260521-081642-fceratto.json
* 08:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
* 08:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6003.drmrs.wmnet
* 08:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1256.eqiad.wmnet with OS trixie
* 07:52 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:51 marostegui@dns1004: END - running authdns-update
* 07:50 marostegui@dns1004: START - running authdns-update
* 07:48 marostegui: Failover m3-master [[phab:T426633|T426633]]
* 07:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet
* 07:46 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6010.drmrs.wmnet
* 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:38 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6002.drmrs.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:24 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:24 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1256.eqiad.wmnet with OS trixie
* 07:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy1025.eqiad.wmnet with reason: Rebooting
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:52 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
* 06:40 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 06:39 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
* 06:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
* 06:24 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:23 arnaudb@cumin1003: END (FAIL) - Cookbook sre.gerrit.reboot-gerrit (exit_code=99) Rebooting Gerrit on gerrit2003
* 06:22 arnaudb@cumin1003: START - Cookbook sre.gerrit.reboot-gerrit Rebooting Gerrit on gerrit2003
* 06:15 marostegui@dns1004: END - running authdns-update
* 06:14 marostegui: Failover m2-master [[phab:T426633|T426633]]
* 06:13 marostegui@dns1004: START - running authdns-update
* 05:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1012 from dbctl [[phab:T426930|T426930]]', diff saved to https://phabricator.wikimedia.org/P92728 and previous config saved to /var/cache/conftool/dbconfig/20260521-053858-marostegui.json
* 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92727 and previous config saved to /var/cache/conftool/dbconfig/20260521-053000-marostegui.json
* 05:29 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1022 to pc2 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92726 and previous config saved to /var/cache/conftool/dbconfig/20260521-052905-marostegui.json
* 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc1012.eqiad.wmnet with reason: Cloning
* 02:41 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on planet1003.eqiad.wmnet with reason: debug wip
* 02:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1027.eqiad.wmnet
* 01:22 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1027.eqiad.wmnet
* 00:55 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2428858
2428857
2026-06-20T13:31:32Z
Stashbot
7414
arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
2428858
wikitext
text/x-wiki
== 2026-06-20 ==
* 13:31 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-19 ==
* 19:21 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303006{{!}}Disable ShortUrl on remaining wikis (T107188)]] (duration: 80m 14s)
* 19:17 krinkle@deploy1003: krinkle: Continuing with deployment
* 18:03 krinkle@deploy1003: krinkle: Backport for [[gerrit:1303006{{!}}Disable ShortUrl on remaining wikis (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:01 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1303006{{!}}Disable ShortUrl on remaining wikis (T107188)]]
* 16:22 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 16:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1023.eqiad.wmnet
* 16:08 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 16:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1023.eqiad.wmnet
* 16:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1022.eqiad.wmnet
* 15:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1022.eqiad.wmnet
* 15:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1021.eqiad.wmnet
* 15:45 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1021.eqiad.wmnet
* 15:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet
* 15:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1020.eqiad.wmnet
* 15:34 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 15:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2004.codfw.wmnet
* 15:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2004.codfw.wmnet
* 15:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2003.codfw.wmnet
* 15:17 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2003.codfw.wmnet
* 15:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2002.codfw.wmnet
* 15:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2002.codfw.wmnet
* 15:11 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet
* 14:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2009.codfw.wmnet with OS trixie
* 13:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
* 13:41 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
* 13:28 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
* 13:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2001.codfw.wmnet
* 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs1003.eqiad.wmnet
* 12:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs1003.eqiad.wmnet
* 12:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs1002.eqiad.wmnet
* 12:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs1002.eqiad.wmnet
* 12:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs1001.eqiad.wmnet
* 12:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs1001.eqiad.wmnet
* 12:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1022.eqiad.wmnet
* 12:32 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1022.eqiad.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2234.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2234.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2232.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2232.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 12:10 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 12:08 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on phab2002.codfw.wmnet with reason: Host Replacement
* 12:05 urbanecm@deploy1003: mwscript-k8s job started: GrowthExperiments:migrateMentorStatusAway.php --wiki=viwiki # [[phab:T409170|T409170]]
* 12:04 urbanecm@deploy1003: mwscript-k8s job started: GrowthExperiments:MigrateMentorStatusAway --wiki=viwiki # [[phab:T409170|T409170]]
* 11:33 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:38 moritzm: imported nodejs 24.17.0-1nodesource1 to thirdparty/node24 for trixie-wikimedia
* 10:37 moritzm: imported nodejs 22.23.0-1nodesource1 to thirdparty/node22 for trixie-wikimedia
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 10:29 sergi0: Run `MigrateMentorStatusAway` script for all wikis in growthexperiments dblist - [[phab:T409170|T409170]]
* 10:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet
* 10:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1020.eqiad.wmnet
* 10:04 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1024
* 10:03 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1024
* 10:03 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1023
* 10:03 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1023
* 10:03 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1021
* 10:03 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1021
* 10:00 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1024
* 09:59 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1024
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1022
* 09:57 btullis@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1022
* 09:56 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1020
* 09:54 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1020
* 09:43 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet
* 09:36 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1020.eqiad.wmnet
* 07:32 slyngs: Update IDP/SSO to CAS v7.3.7.3
* 07:31 slyngshede@dns1004: END - running authdns-update
* 07:30 slyngshede@dns1004: START - running authdns-update
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 49s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:19 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: sync
* 01:18 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: sync
* 01:18 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: sync
* 01:17 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics: sync
* 01:17 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: sync
* 01:17 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics: sync
* 01:06 ottomata: roll restart eventgate-analytics to pick up stream config change - [[phab:T427787|T427787]]
== 2026-06-18 ==
* 23:46 Amir1: ALTER TABLE reading_list_project AUTO_INCREMENT = 882; on wikishared on x1 master ([[phab:T428002|T428002]])
* 23:34 rzl@deploy1003: Finished deploy [docker-pkg/deploy@f030aed]: (no justification provided) (duration: 00m 45s)
* 23:33 rzl@deploy1003: Started deploy [docker-pkg/deploy@f030aed]: (no justification provided)
* 23:28 rzl@deploy1003: Finished deploy [docker-pkg/deploy@f030aed]: (no justification provided) (duration: 00m 26s)
* 23:27 rzl@deploy1003: Started deploy [docker-pkg/deploy@f030aed]: (no justification provided)
* 23:03 rzl: rzl@apt1002:~$ sudo -i reprepro -C main include trixie-wikimedia /home/rzl/httpbb/trixie/httpbb_0.0.5-1+deb13u1_amd64.changes # [[phab:T427899|T427899]]
* 22:52 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304195{{!}}hCaptcha: Re-enable for mcrundo (T427612)]] (duration: 07m 25s)
* 22:47 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 22:46 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1304195{{!}}hCaptcha: Re-enable for mcrundo (T427612)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1304195{{!}}hCaptcha: Re-enable for mcrundo (T427612)]]
* 21:29 maryum: Deployed security fix for [[phab:T428833|T428833]]
* 21:14 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303493{{!}}Prevent surveys being automatically added to non-Wikipedias (T393436)]] (duration: 07m 54s)
* 21:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 21:10 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 21:09 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 21:08 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1303493{{!}}Prevent surveys being automatically added to non-Wikipedias (T393436)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:06 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1303493{{!}}Prevent surveys being automatically added to non-Wikipedias (T393436)]]
* 20:12 dani@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303895{{!}}Deploy English Wikipedia Mobile App Survey (T428876)]] (duration: 08m 20s)
* 20:08 dani@deploy1003: dani: Continuing with deployment
* 20:06 dani@deploy1003: dani: Backport for [[gerrit:1303895{{!}}Deploy English Wikipedia Mobile App Survey (T428876)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 dani@deploy1003: Started scap sync-world: Backport for [[gerrit:1303895{{!}}Deploy English Wikipedia Mobile App Survey (T428876)]]
* 19:11 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=dns7002.*
* 19:09 cdobbins@dns1004: END - running authdns-update
* 19:08 cdobbins@dns1004: START - running authdns-update
* 19:07 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=dns7002.*,service=authdns-update
* 19:05 cdobbins@dns1004: END - running authdns-update
* 19:04 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2002.codfw.wmnet with reason: Host Replacement
* 19:03 cdobbins@dns1004: START - running authdns-update
* 19:01 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dns7002.wikimedia.org
* 19:01 cdobbins@cumin2002: START - Cookbook sre.hosts.remove-downtime for dns7002.wikimedia.org
* 18:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns7002.wikimedia.org with OS bookworm
* 18:39 jhuneidi@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.7 refs [[phab:T423916|T423916]]
* 18:37 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 18:34 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 18:33 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 18:31 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 18:29 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:28 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:27 swfrench-wmf: (eqiad) kubectl delete pod coredns-54cdd9bdf-6hwb5 -n kube-system - [[phab:T429156|T429156]]
* 18:27 swfrench-wmf: (eqiad) kubectl delete pod coredns-54cdd9bdf-6n4ps -n kube-system - [[phab:T429156|T429156]]
* 18:26 jhuneidi@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304067{{!}}SpecialSpecialPages: Guard against special pages with no content-language alias (T429584)]] (duration: 08m 46s)
* 18:21 jhuneidi@deploy1003: jhuneidi, jforrester: Continuing with deployment
* 18:19 jhuneidi@deploy1003: jhuneidi, jforrester: Backport for [[gerrit:1304067{{!}}SpecialSpecialPages: Guard against special pages with no content-language alias (T429584)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:17 jhuneidi@deploy1003: Started scap sync-world: Backport for [[gerrit:1304067{{!}}SpecialSpecialPages: Guard against special pages with no content-language alias (T429584)]]
* 18:09 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 18:04 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 17:37 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host dns7002.wikimedia.org with OS bookworm
* 16:28 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304112{{!}}Add script to fix fr_archive_name drifts (T428406)]] (duration: 06m 46s)
* 16:24 zabe@deploy1003: zabe: Continuing with deployment
* 16:24 zabe@deploy1003: zabe: Backport for [[gerrit:1304112{{!}}Add script to fix fr_archive_name drifts (T428406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:22 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1304112{{!}}Add script to fix fr_archive_name drifts (T428406)]]
* 15:55 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303981{{!}}LocalFileMoveBatch: Also update fr_archive_name when moving file (T428406)]] (duration: 06m 49s)
* 15:51 zabe@deploy1003: zabe: Continuing with deployment
* 15:51 zabe@deploy1003: zabe: Backport for [[gerrit:1303981{{!}}LocalFileMoveBatch: Also update fr_archive_name when moving file (T428406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1303981{{!}}LocalFileMoveBatch: Also update fr_archive_name when moving file (T428406)]]
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:08 elukey@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 15:08 elukey@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 15:04 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304082{{!}}Check that data-parsoid is an array before accessing it as such (T429582)]] (duration: 11m 17s)
* 15:00 cscott@deploy1003: ihurbain, cscott: Continuing with deployment
* 14:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:57 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:55 cscott@deploy1003: ihurbain, cscott: Backport for [[gerrit:1304082{{!}}Check that data-parsoid is an array before accessing it as such (T429582)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:53 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1304082{{!}}Check that data-parsoid is an array before accessing it as such (T429582)]]
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:51 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:51 ayounsi@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:46 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:42 moritzm: installing zsh updates from Bookworm point release
* 14:37 brouberol@dns1004: END - running authdns-update
* 14:35 brouberol@dns1004: START - running authdns-update
* 14:27 jgreen@dns1004: END - running authdns-update
* 14:25 jgreen@dns1004: START - running authdns-update
* 14:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy2007.codfw.wmnet
* 14:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy2007.codfw.wmnet
* 14:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy2008.codfw.wmnet
* 14:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy2008.codfw.wmnet
* 14:20 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 14:20 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 14:19 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2235.codfw.wmnet
* 14:19 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2235.codfw.wmnet
* 14:14 Msz2001: Finished deploying private code change
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2235.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2008.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 14:08 moritzm: installing unbound security updates
* 14:07 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2234.codfw.wmnet
* 14:07 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2234.codfw.wmnet
* 14:00 tgr_: UTC afternoon deploys done
* 14:00 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304038{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]], [[gerrit:1304039{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]] (duration: 11m 51s)
* 13:56 tgr@deploy1003: tgr: Continuing with deployment
* 13:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2234.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2007.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:52 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy2005.codfw.wmnet
* 13:52 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy2005.codfw.wmnet
* 13:51 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2232.codfw.wmnet
* 13:51 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2232.codfw.wmnet
* 13:51 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 13:51 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 13:50 tgr@deploy1003: tgr: Backport for [[gerrit:1304038{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]], [[gerrit:1304039{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:48 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1304038{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]], [[gerrit:1304039{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]]
* 13:46 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303613{{!}}magwiki: add wordmark, metanamespace, sitename and timezone (T428279)]], [[gerrit:1304004{{!}}stream: webrequest.page_trending.dev0 (T429588)]] (duration: 08m 15s)
* 13:42 tgr@deploy1003: javiermonton, tgr, anzx: Continuing with deployment
* 13:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 13:40 tgr@deploy1003: javiermonton, tgr, anzx: Backport for [[gerrit:1303613{{!}}magwiki: add wordmark, metanamespace, sitename and timezone (T428279)]], [[gerrit:1304004{{!}}stream: webrequest.page_trending.dev0 (T429588)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:38 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1303613{{!}}magwiki: add wordmark, metanamespace, sitename and timezone (T428279)]], [[gerrit:1304004{{!}}stream: webrequest.page_trending.dev0 (T429588)]]
* 13:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2232.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2005.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:33 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 13:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303004{{!}}REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771)]] (duration: 06m 56s)
* 13:26 ladsgroup@deploy1003: ladsgroup, bpirkle: Continuing with deployment
* 13:25 ladsgroup@deploy1003: ladsgroup, bpirkle: Backport for [[gerrit:1303004{{!}}REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303004{{!}}REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771)]]
* 13:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:21 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302923{{!}}EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380)]] (duration: 10m 55s)
* 13:14 ladsgroup@deploy1003: ladsgroup, lerickson: Continuing with deployment
* 13:10 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:10 ladsgroup@deploy1003: ladsgroup, lerickson: Backport for [[gerrit:1302923{{!}}EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:08 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:08 fabfur: deploying new haproxykafka on A:cp to parse for x_provenance ([[phab:T427068|T427068]])
* 13:08 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1302923{{!}}EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of testvm2005.codfw.wmnet to plain
* 13:05 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to plain
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
* 13:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis magwiki in section s5
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 12:56 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 12:39 fabfur: upgrade haproxykafka on cp1111 to test for new x-provenance field ([[phab:T427068|T427068]])
* 12:36 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 12:35 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis magwiki in section s5
* 12:34 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 12:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis magwiki in section s5
* 12:31 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304017{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]], [[gerrit:1304016{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]] (duration: 17m 49s)
* 12:29 cwilliams@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis magwiki in section s5
* 12:27 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:16 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1304017{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]], [[gerrit:1304016{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:14 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1304017{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]], [[gerrit:1304016{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]]
* 11:10 atsukoito: atsuko updated charlie to 0.0.19 https://w.wiki/RPKN
* 10:37 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.disable-merges (exit_code=99)
* 10:37 jmm@cumin2002: START - Cookbook sre.puppet.disable-merges
* 10:24 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303986{{!}}hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394)]] (duration: 12m 13s)
* 10:19 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 10:14 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1303986{{!}}hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1303986{{!}}hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394)]]
* 10:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 10:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 10:01 fabfur@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]"
* 10:01 fabfur@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]
* 10:00 fabfur@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]
* 10:00 fabfur@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]"
* 09:59 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303983{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]], [[gerrit:1303982{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]] (duration: 08m 10s)
* 09:55 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:54 kharlan@deploy1003: kharlan: Backport for [[gerrit:1303983{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]], [[gerrit:1303982{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:51 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1303983{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]], [[gerrit:1303982{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]]
* 09:33 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 09:33 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 09:33 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 09:32 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 09:11 moritzm: installing apache2 security updates
* 08:55 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 08:53 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 08:53 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 08:51 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 08:51 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 08:51 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 08:35 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:34 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:22 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 08:21 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 08:20 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 08:19 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 08:05 moritzm: regenerate pbuilder environments on build2001 to use deb.debian.org [[phab:T416707|T416707]]
* 08:02 moritzm: uploaded wmf-laptop 1.0.6 to component/wmf-laptop on apt.wikimedia.org
* 08:01 moritzm: regenerate pbuilder environments on build2002 to use deb.debian.org [[phab:T416707|T416707]]
* 06:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 06:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2040: Migration of es2040.codfw.wmnet completed
* 06:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2040: Migration of es2040.codfw.wmnet completed
* 05:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2040.codfw.wmnet with OS trixie
* 05:41 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
* 05:41 marostegui@cumin1003: Removing db1224 from zarcillo [[phab:T429561|T429561]]
* 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1224.eqiad.wmnet
* 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1224.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:40 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1224.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:36 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 05:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2040.codfw.wmnet with reason: host reimage
* 05:31 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2040.codfw.wmnet with reason: host reimage
* 05:31 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db1224.eqiad.wmnet
* 05:30 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 05:27 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db1224 from dbctl [[phab:T429561|T429561]]', diff saved to https://phabricator.wikimedia.org/P94269 and previous config saved to /var/cache/conftool/dbconfig/20260618-052737-marostegui.json
* 05:14 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2040.codfw.wmnet with OS trixie
* 05:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2040: Upgrading es2040.codfw.wmnet
* 05:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2040: Upgrading es2040.codfw.wmnet
* 05:12 marostegui@cumin1003: dbmaint on es7@codfw [[phab:T429463|T429463]]
* 05:12 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 45s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303600{{!}}Update interwiki map (T428266)]] (duration: 06m 55s)
* 01:15 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 01:14 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1303600{{!}}Update interwiki map (T428266)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:12 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303600{{!}}Update interwiki map (T428266)]]
* 00:48 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303596{{!}}Activate magwiki (T428266)]] (duration: 07m 25s)
* 00:43 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:42 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1303596{{!}}Activate magwiki (T428266)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:40 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303596{{!}}Activate magwiki (T428266)]]
* 00:33 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303594{{!}}Init magwiki (T428266)]] (duration: 07m 14s)
* 00:29 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:28 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1303594{{!}}Init magwiki (T428266)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303594{{!}}Init magwiki (T428266)]]
== 2026-06-17 ==
* 23:26 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303504{{!}}Enable beta mobile MMV on Wikipedias (T426775)]] (duration: 06m 46s)
* 23:22 egardner@deploy1003: egardner: Continuing with deployment
* 23:21 egardner@deploy1003: egardner: Backport for [[gerrit:1303504{{!}}Enable beta mobile MMV on Wikipedias (T426775)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:19 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1303504{{!}}Enable beta mobile MMV on Wikipedias (T426775)]]
* 23:17 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303552{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303553{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303554{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] (duration: 06m 55s)
* 23:14 mutante: gerrit2002 - unlink /srv/gerrit/site_path/review_site/logs/logs ([[phab:T425667|T425667]])
* 23:12 egardner@deploy1003: egardner: Continuing with deployment
* 23:12 egardner@deploy1003: egardner: Backport for [[gerrit:1303552{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303553{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303554{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:10 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1303552{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303553{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303554{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]]
* 23:04 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303571{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303572{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303573{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] (duration: 12m 31s)
* 22:57 egardner@deploy1003: egardner: Continuing with deployment
* 22:56 egardner@deploy1003: egardner: Backport for [[gerrit:1303571{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303572{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303573{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:52 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1303571{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303572{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303573{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]]
* 22:45 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303517{{!}}Donor Delight Badge: Add accessible label and hide popover from AT (T427313)]] (duration: 31m 01s)
* 22:32 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:31 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1303517{{!}}Donor Delight Badge: Add accessible label and hide popover from AT (T427313)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:14 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1303517{{!}}Donor Delight Badge: Add accessible label and hide popover from AT (T427313)]]
* 21:52 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:52 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:29 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:29 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:29 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:28 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:27 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:27 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:23 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:22 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:22 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:21 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:20 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:20 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:15 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:12 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:12 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:09 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:06 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:05 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:02 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 21:02 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 20:45 cdobbins@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dns7002.wikimedia.org with reason: bird.service keeps failing
* 20:41 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-ats (exit_code=0) rolling restart_daemons on A:cp
* 20:41 cdobbins@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns7002.wikimedia.org with OS trixie
* 20:36 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303012{{!}}Enable ULS v2 on group1 wikis]] (duration: 08m 26s)
* 20:31 sbisson@deploy1003: sbisson, abi: Continuing with deployment
* 20:29 sbisson@deploy1003: sbisson, abi: Backport for [[gerrit:1303012{{!}}Enable ULS v2 on group1 wikis]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:27 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1303012{{!}}Enable ULS v2 on group1 wikis]]
* 20:17 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303365{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]], [[gerrit:1303364{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]] (duration: 06m 55s)
* 20:13 sgimeno@deploy1003: sgimeno: Continuing with deployment
* 20:12 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1303365{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]], [[gerrit:1303364{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:11 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1303365{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]], [[gerrit:1303364{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]]
* 19:44 jgreen@dns1005: END - running authdns-update
* 19:42 jgreen@dns1005: START - running authdns-update
* 19:31 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5005*<nowiki>}</nowiki> and A:liberica ([[phab:T428229|T428229]])
* 19:30 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5005*<nowiki>}</nowiki> and A:liberica ([[phab:T428229|T428229]])
* 19:16 jhuneidi@deploy1003: Finished scap sync-world: wmf.7 to group 1 (Take 2) (duration: 07m 01s)
* 19:16 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-purged (exit_code=0) rolling restart_daemons on A:cp and not P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:10 jhuneidi@deploy1003: Started scap sync-world: wmf.7 to group 1 (Take 2)
* 19:08 jhuneidi@deploy1003: Finished scap sync-world: Attempt to roll wmf.7 to group 1 (duration: 07m 24s)
* 19:01 jhuneidi@deploy1003: Started scap sync-world: Attempt to roll wmf.7 to group 1
* 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcontrol1008-dev.eqiad.wmnet
* 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol1008-dev.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 18:59 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol1008-dev.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 18:52 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 18:46 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudcontrol1008-dev.eqiad.wmnet
* 18:24 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6011.*
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6011.drmrs.wmnet
* 18:24 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp6011.drmrs.wmnet
* 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp6011.drmrs.wmnet with reason: ats restart, continuing from failed cookbook run
* 18:17 brett: commit new lvs5005 IP address to cr2-eqsin.wikimedia.org,cr3-eqsin.wikimedia.org
* 18:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp6011.drmrs.wmnet
* 18:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp6011.drmrs.wmnet
* 18:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6011.*
* 17:41 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs5005.eqsin.wmnet with OS bookworm
* 17:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs5005.eqsin.wmnet with reason: host reimage
* 17:16 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs5005.eqsin.wmnet with reason: host reimage
* 17:06 mutante: contint1003 - even with gerrit:1301416 jenkins was STILL restarted :/ - stopping it manually and puppet - debugging - [[phab:T418521|T418521]]
* 17:03 mutante: contint1003 - re-enabling puppet - checking it does NOT start jenkins - also see gerrit:1297236 and gerrit:1301416 - [[phab:T418521|T418521]]
* 16:51 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:51 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:49 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-ats rolling restart_daemons on A:cp
* 16:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host lvs5005
* 16:48 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs5005
* 16:48 dcausse@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:47 dcausse@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:47 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs5005
* 16:47 brett@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:47 brett@cumin2002: START - Cookbook sre.dns.wipe-cache lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:45 brett@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:45 brett@cumin2002: START - Cookbook sre.dns.wipe-cache lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:45 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:45 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host lvs5005 - brett@cumin2002"
* 16:45 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host lvs5005 - brett@cumin2002"
* 16:45 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:45 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:39 brett@cumin2002: START - Cookbook sre.dns.netbox
* 16:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1078.eqiad.wmnet with OS trixie
* 16:16 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:16 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host lvs5005
* 16:16 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:15 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs5005.eqsin.wmnet with OS bookworm
* 16:15 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1007.eqiad.wmnet with OS trixie
* 16:15 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:11 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:02 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 16:02 brett@cumin2002: START - Cookbook sre.loadbalancer.admin depooling P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 16:00 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-purged rolling restart_daemons on A:cp and not P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 15:58 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1078.eqiad.wmnet with reason: host reimage
* 15:54 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1007.eqiad.wmnet with reason: host reimage
* 15:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Migration of es2048.codfw.wmnet completed
* 15:53 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1078.eqiad.wmnet with reason: host reimage
* 15:47 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1007.eqiad.wmnet with reason: host reimage
* 15:46 moritzm: installing python-ldap security updates
* 15:42 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1078.eqiad.wmnet with OS trixie
* 15:30 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:27 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 15:26 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
* 15:08 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Migration of es2048.codfw.wmnet completed
* 15:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:03 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1004.eqiad.wmnet with OS trixie
* 15:02 aokoth@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab (duration: 01m 24s)
* 15:00 aokoth@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab
* 14:59 cdobbins@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 14:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2048.codfw.wmnet with OS trixie
* 14:56 cdobbins@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 14:44 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1004.eqiad.wmnet with reason: host reimage
* 14:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2048.codfw.wmnet with reason: host reimage
* 14:35 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1004.eqiad.wmnet with reason: host reimage
* 14:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2048.codfw.wmnet with reason: host reimage
* 14:28 cdobbins@cumin1003: START - Cookbook sre.hosts.reimage for host dns7002.wikimedia.org with OS trixie
* 14:26 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303436{{!}}Add Wikidata configuration for WikiProject links (T422935 T422936)]] (duration: 07m 49s)
* 14:22 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
* 14:21 cjd91: depooling dns7002 to attempt reimage to trixie
* 14:20 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1303436{{!}}Add Wikidata configuration for WikiProject links (T422935 T422936)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:19 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1004.eqiad.wmnet with OS trixie
* 14:19 cdobbins@cumin1003: conftool action : set/pooled=no; selector: name=dns7002.*
* 14:18 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1303436{{!}}Add Wikidata configuration for WikiProject links (T422935 T422936)]]
* 14:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2048.codfw.wmnet with OS trixie
* 14:17 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 14:17 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 14:17 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 14:16 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 14:16 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2048: Upgrading es2048.codfw.wmnet
* 14:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2048: Upgrading es2048.codfw.wmnet
* 14:13 elukey: add basic Kafka ACLs for anonymous to logging-eqiad - [[phab:T425528|T425528]]
* 14:13 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:13 Lucas_WMDE: UTC afternoon backport+config window done
* {{safesubst:SAL entry|1=14:13 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302739{{!}}ULS rewrite: Lock body scroll when open on mobile]], [[gerrit:1302743{{!}}ULS rewrite: Fix settings dialog width and field sizing (T416512)]], [[gerrit:1303010{{!}}ULS rewrite: Show variants even when no languages are available (T426532)]], [[gerrit:1303009{{!}}ULS rewrite: Capture trigger element before async module load (T429145)]], …}}
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs-test1001.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1003.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1002.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1001.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs-test1001.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1003.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1002.eqiad.wmnet
* 14:11 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1001.eqiad.wmnet
* 14:11 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs*.eqiad.wmnet
* 14:08 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, abi: Continuing with deployment
* 14:06 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:01 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 14:00 jmm@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2003.codfw.wmnet with OS bookworm
* 13:58 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2004.codfw.wmnet with OS bookworm
* 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* {{safesubst:SAL entry|1=13:55 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, abi: Backport for [[gerrit:1302739{{!}}ULS rewrite: Lock body scroll when open on mobile]], [[gerrit:1302743{{!}}ULS rewrite: Fix settings dialog width and field sizing (T416512)]], [[gerrit:1303010{{!}}ULS rewrite: Show variants even when no languages are available (T426532)]], [[gerrit:1303009{{!}}ULS rewrite: Capture trigger element before async module load (T429145)]], [[ge}}
* {{safesubst:SAL entry|1=13:53 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1302739{{!}}ULS rewrite: Lock body scroll when open on mobile]], [[gerrit:1302743{{!}}ULS rewrite: Fix settings dialog width and field sizing (T416512)]], [[gerrit:1303010{{!}}ULS rewrite: Show variants even when no languages are available (T426532)]], [[gerrit:1303009{{!}}ULS rewrite: Capture trigger element before async module load (T429145)]], [[gerri}}
* 13:52 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 13:51 jmm@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 13:51 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.bmc-user-mgmt (exit_code=0) for host sretest[2001,2003-2004,2006,2009-2010].codfw.wmnet,sretest1005.eqiad.wmnet
* 13:50 elukey@cumin1003: START - Cookbook sre.hosts.bmc-user-mgmt for host sretest[2001,2003-2004,2006,2009-2010].codfw.wmnet,sretest1005.eqiad.wmnet
* 13:47 papaul: mgmt interface change on mr-codfw
* 13:46 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-codfw with reason: mgmt interface change
* 13:45 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-codfw with reason: switch refresh
* 13:42 jmm@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:42 jmm@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 13:33 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298293{{!}}Add Wikidata configuration for WikiProject links (T422935)]], [[gerrit:1299943{{!}}Add instance-of WikiProject links for paintings and elections (T422936)]] (duration: 08m 14s)
* 13:32 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1006.eqiad.wmnet with OS trixie
* 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudcephosd1016
* 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudcephosd1016
* 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1061
* 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1061
* 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1069
* 13:31 lucaswerkmeister-wmde@deploy1003: sadiyamohammed13, lucaswerkmeister-wmde: Rolling back deployment
* 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1069
* 13:30 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1068
* 13:30 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1068
* 13:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1005.eqiad.wmnet with OS trixie
* 13:27 lucaswerkmeister-wmde@deploy1003: sadiyamohammed13, lucaswerkmeister-wmde: Backport for [[gerrit:1298293{{!}}Add Wikidata configuration for WikiProject links (T422935)]], [[gerrit:1299943{{!}}Add instance-of WikiProject links for paintings and elections (T422936)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298293{{!}}Add Wikidata configuration for WikiProject links (T422935)]], [[gerrit:1299943{{!}}Add instance-of WikiProject links for paintings and elections (T422936)]]
* 13:24 jmm@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 13:23 jmm@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 13:14 dani@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302998{{!}}Add English Wikipedia Mobile App Survey (T428876)]] (duration: 07m 53s)
* 13:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1006.eqiad.wmnet with reason: host reimage
* 13:11 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-codfw
* 13:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1005.eqiad.wmnet with reason: host reimage
* 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-eqiad
* 13:10 dani@deploy1003: dani: Continuing with deployment
* 13:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1045: repool after upgrade
* 13:08 dani@deploy1003: dani: Backport for [[gerrit:1302998{{!}}Add English Wikipedia Mobile App Survey (T428876)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1006.eqiad.wmnet with reason: host reimage
* 13:06 dani@deploy1003: Started scap sync-world: Backport for [[gerrit:1302998{{!}}Add English Wikipedia Mobile App Survey (T428876)]]
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1005.eqiad.wmnet with reason: host reimage
* 13:00 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:53 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:52 blake@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host mc-gp1006
* 12:52 blake@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host mc-gp1006
* 12:51 blake@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc-gp1006
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mc-gp1006.eqiad.wmnet 182.48.64.10.in-addr.arpa 2.8.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:51 blake@cumin1003: START - Cookbook sre.dns.wipe-cache mc-gp1006.eqiad.wmnet 182.48.64.10.in-addr.arpa 2.8.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host mc-gp1005
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host mc-gp1005
* 12:49 blake@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc-gp1005
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mc-gp1005.eqiad.wmnet 126.32.64.10.in-addr.arpa 6.2.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:49 blake@cumin1003: START - Cookbook sre.dns.wipe-cache mc-gp1005.eqiad.wmnet 126.32.64.10.in-addr.arpa 6.2.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host mc-gp1005 - blake@cumin1003"
* 12:49 blake@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host mc-gp1005 - blake@cumin1003"
* 12:48 blake@cumin1003: START - Cookbook sre.dns.netbox
* 12:45 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-codfw
* 12:45 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-eqiad
* 12:43 blake@cumin1003: START - Cookbook sre.dns.netbox
* 12:41 blake@cumin1003: START - Cookbook sre.hosts.move-vlan for host mc-gp1006
* 12:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 12:41 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:41 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:41 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:41 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1006.eqiad.wmnet with OS trixie
* 12:41 blake@cumin1003: START - Cookbook sre.hosts.move-vlan for host mc-gp1005
* 12:40 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1005.eqiad.wmnet with OS trixie
* 12:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2004.codfw.wmnet with reason: host reimage
* 12:37 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1163: Migration of db1163.eqiad.wmnet completed
* 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2003.codfw.wmnet with reason: host reimage
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:32 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 12:32 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 12:32 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 12:32 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 12:29 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2004.codfw.wmnet with reason: host reimage
* 12:28 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2003.codfw.wmnet with reason: host reimage
* 12:24 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1045: repool after upgrade
* 12:23 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:22 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1045.eqiad.wmnet with OS trixie
* 12:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2001.codfw.wmnet with reason: host reimage
* 12:19 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:16 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2004.codfw.wmnet with OS bookworm
* 12:16 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2003.codfw.wmnet with OS bookworm
* 12:15 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2001.codfw.wmnet with reason: host reimage
* 12:13 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 12:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:07 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:07 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:07 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 12:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:05 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1045.eqiad.wmnet with reason: host reimage
* 12:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2044: repool after maintenance es2044
* 12:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 12:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 12:01 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1045.eqiad.wmnet with reason: host reimage
* 11:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 11:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 11:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 11:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 11:51 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 11:51 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1163: Migration of db1163.eqiad.wmnet completed
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1045.eqiad.wmnet with OS trixie
* 11:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1045: Upgrading es1045.eqiad.wmnet
* 11:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1045: Upgrading es1045.eqiad.wmnet
* 11:42 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1163.eqiad.wmnet with OS trixie
* 11:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 11:35 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 11:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1191.eqiad.wmnet with reason: upgrading
* 11:23 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 11:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1163.eqiad.wmnet with reason: host reimage
* 11:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1172.eqiad.wmnet with reason: upgrading
* 11:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet
* 11:21 marostegui@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1:00:00 on db1171.eqiad.wmnet with reason: upgrading
* 11:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: upgrading
* 11:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1163.eqiad.wmnet with reason: host reimage
* 11:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:17 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2044: repool after maintenance es2044
* 11:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2044.codfw.wmnet with OS trixie
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1003.eqiad.wmnet with OS bookworm
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:11 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:10 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 11:09 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:08 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1038: Migration of es1038.eqiad.wmnet completed
* 11:04 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1163.eqiad.wmnet with OS trixie
* 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:01 moritzm: The Debian mirror on mirrors.wikimedia.org has been disabled [[phab:T416707|T416707]]
* 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1163: Upgrading db1163.eqiad.wmnet
* 10:59 btullis@cumin1003: START - Cookbook sre.hosts.dhcp for host dse-k8s-wdqs2001.codfw.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1163: Upgrading db1163.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2044.codfw.wmnet with reason: host reimage
* 10:53 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2044.codfw.wmnet with reason: host reimage
* 10:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1003.eqiad.wmnet with reason: host reimage
* 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2203: Migration of db2203.codfw.wmnet completed
* 10:43 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1003.eqiad.wmnet with reason: host reimage
* 10:38 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2044.codfw.wmnet with OS trixie
* 10:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2044: Upgrading es2044.codfw.wmnet
* 10:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2044: Upgrading es2044.codfw.wmnet
* 10:35 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:35 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:35 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:34 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:34 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:34 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:31 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1003.eqiad.wmnet with OS bookworm
* 10:29 moritzm: installing git-lfs security updates
* 10:28 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 10:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1038: Migration of es1038.eqiad.wmnet completed
* 10:22 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 10:21 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 10:17 claime: cumin -x 'A:swift-fe' "enable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1038.eqiad.wmnet with OS trixie
* 10:12 claime: cumin -x 'A:swift-fe' "disable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
* 10:10 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 10:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:04 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1002.eqiad.wmnet with reason: host reimage
* 10:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2203: Migration of db2203.codfw.wmnet completed
* 10:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1002.eqiad.wmnet with reason: host reimage
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1038.eqiad.wmnet with reason: host reimage
* 09:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1038.eqiad.wmnet with reason: host reimage
* 09:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2203.codfw.wmnet with OS trixie
* 09:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2045: repool after maintenance es2045
* 09:48 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 09:47 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303356{{!}}hCaptcha: Remove config for VE and DT enable (T428883)]], [[gerrit:1303354{{!}}Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883)]] (duration: 15m 32s)
* 09:41 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 09:39 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 09:38 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1038.eqiad.wmnet with OS trixie
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1038: Upgrading es1038.eqiad.wmnet
* 09:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1303356{{!}}hCaptcha: Remove config for VE and DT enable (T428883)]], [[gerrit:1303354{{!}}Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1038: Upgrading es1038.eqiad.wmnet
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:37 marostegui@dns1004: END - running authdns-update
* 09:36 marostegui@cumin1003: dbctl commit (dc=all): 'Set es6 eqiad back to read-write - [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94226 and previous config saved to /var/cache/conftool/dbconfig/20260617-093559-marostegui.json
* 09:35 marostegui@dns1004: START - running authdns-update
* 09:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es1038 [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94225 and previous config saved to /var/cache/conftool/dbconfig/20260617-093513-marostegui.json
* 09:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2203.codfw.wmnet with reason: host reimage
* 09:33 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1037 to es6 primary [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94224 and previous config saved to /var/cache/conftool/dbconfig/20260617-093310-marostegui.json
* 09:32 marostegui: Starting es6 eqiad failover from es1038 to es1037 - [[phab:T429436|T429436]]
* 09:32 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1303356{{!}}hCaptcha: Remove config for VE and DT enable (T428883)]], [[gerrit:1303354{{!}}Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883)]]
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es1037 with weight 0 [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94223 and previous config saved to /var/cache/conftool/dbconfig/20260617-092940-marostegui.json
* 09:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 8 hosts with reason: Primary switchover es6 [[phab:T429436|T429436]]
* 09:29 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es6 eqiad as read-only for maintenance - [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94222 and previous config saved to /var/cache/conftool/dbconfig/20260617-092913-marostegui.json
* 09:27 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2203.codfw.wmnet with reason: host reimage
* 09:26 jynus: testing x1 backups @ cumin2003 [[phab:T427897|T427897]]
* 09:11 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2203.codfw.wmnet with OS trixie
* 09:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2203: Upgrading db2203.codfw.wmnet
* 09:09 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2203: Upgrading db2203.codfw.wmnet
* 09:09 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:07 elukey: add basic Kafka ACLs for anonymous to logging-codfw - [[phab:T425528|T425528]] (I'll add rollback steps in the task if needed)
* 09:06 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2045: repool after maintenance es2045
* 09:06 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es2044: Upgrading es2044.codfw.wmnet
* 09:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2044: Upgrading es2044.codfw.wmnet
* 09:04 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:02 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2046 to es5 codfw primary [[phab:T428572|T428572]]', diff saved to https://phabricator.wikimedia.org/P94219 and previous config saved to /var/cache/conftool/dbconfig/20260617-090221-marostegui.json
* 09:02 joal@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:01 joal@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:00 joal@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 08:59 joal@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:57 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2203 [[phab:T429190|T429190]]', diff saved to https://phabricator.wikimedia.org/P94218 and previous config saved to /var/cache/conftool/dbconfig/20260617-085615-cwilliams.json
* 08:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2009.codfw.wmnet with OS trixie
* 08:55 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:55 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:53 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2212 to s1 primary [[phab:T429190|T429190]]', diff saved to https://phabricator.wikimedia.org/P94217 and previous config saved to /var/cache/conftool/dbconfig/20260617-085310-cwilliams.json
* 08:51 cezmunsta: Starting s1 codfw failover from db2203 to db2212 - [[phab:T429190|T429190]]
* 08:51 marostegui@dns1004: END - running authdns-update
* 08:49 marostegui@dns1004: START - running authdns-update
* 08:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:46 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2212 with weight 0 [[phab:T429190|T429190]]', diff saved to https://phabricator.wikimedia.org/P94215 and previous config saved to /var/cache/conftool/dbconfig/20260617-084642-cwilliams.json
* 08:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:46 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T429190|T429190]]
* 08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1044: repool after upgrade
* 08:38 jelto: "Imported helm3 3.19.5-1 to bullseye-wikimedia, bookworm-wikimedia and trixie-wikimedia - [[phab:T427403|T427403]]"
* 08:38 moritzm: installing apache2 security updates
* 08:36 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:35 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2009.codfw.wmnet with reason: host reimage
* 08:31 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2009.codfw.wmnet with reason: host reimage
* 08:25 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303296{{!}}Squashed diff to master]], [[gerrit:1303295{{!}}Squashed diff to master]] (duration: 35m 34s)
* 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2008.codfw.wmnet with OS trixie
* 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:17 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:14 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2009.codfw.wmnet with OS trixie
* 08:12 mlitn@deploy1003: mlitn: Continuing with deployment
* 08:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 08:09 mlitn@deploy1003: mlitn: Backport for [[gerrit:1303296{{!}}Squashed diff to master]], [[gerrit:1303295{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:07 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 08:04 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2008.codfw.wmnet with reason: host reimage
* 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2007.codfw.wmnet with OS trixie
* 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:03 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1001.eqiad.wmnet with OS bookworm
* 08:01 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 08:00 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1044: repool after upgrade
* 08:00 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2008.codfw.wmnet with reason: host reimage
* 07:59 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1044.eqiad.wmnet with OS trixie
* 07:53 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:49 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1303296{{!}}Squashed diff to master]], [[gerrit:1303295{{!}}Squashed diff to master]]
* 07:44 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2007.codfw.wmnet with reason: host reimage
* 07:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 07:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 07:42 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2008.codfw.wmnet with OS trixie
* 07:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1044.eqiad.wmnet with reason: host reimage
* 07:39 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2007.codfw.wmnet with reason: host reimage
* 07:32 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1044.eqiad.wmnet with reason: host reimage
* 07:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:23 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:23 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2007.codfw.wmnet with OS trixie
* 07:22 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:22 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003"
* 07:22 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003
* 07:21 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:21 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003
* 07:21 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003"
* 07:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1044.eqiad.wmnet with OS trixie
* 07:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1044: Upgrading es1044.eqiad.wmnet
* 07:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1044: Upgrading es1044.eqiad.wmnet
* 07:15 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1037: Migration of es1037.eqiad.wmnet completed
* 06:53 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 06:53 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 06:52 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 06:52 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 06:46 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 06:46 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 06:46 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 06:46 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 06:28 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1037: Migration of es1037.eqiad.wmnet completed
* 06:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1037.eqiad.wmnet with OS trixie
* 05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1037.eqiad.wmnet with reason: host reimage
* 05:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1037.eqiad.wmnet with reason: host reimage
* 05:38 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1037.eqiad.wmnet with OS trixie
* 05:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1037: Upgrading es1037.eqiad.wmnet
* 05:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1037: Upgrading es1037.eqiad.wmnet
* 05:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 00:01 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
* 00:01 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
== 2026-06-16 ==
* 23:44 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2006.codfw.wmnet with reason: host reimage
* 23:38 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2006.codfw.wmnet with reason: host reimage
* 23:03 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 23:02 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp - OpenSSL update ()
* 23:01 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
* 22:57 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
* 22:57 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
* 22:52 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
* 22:50 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
* 22:50 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:49 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:37 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
* 22:30 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 22:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:08 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:07 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302953{{!}}Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355)]], [[gerrit:1302952{{!}}Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355)]] (duration: 08m 11s)
* 22:02 kemayo@deploy1003: kemayo: Continuing with deployment
* 22:01 kemayo@deploy1003: kemayo: Backport for [[gerrit:1302953{{!}}Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355)]], [[gerrit:1302952{{!}}Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:59 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1302953{{!}}Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355)]], [[gerrit:1302952{{!}}Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355)]]
* 21:52 ryankemper@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 21:50 ryankemper@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 21:49 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 21:48 ryankemper@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 21:48 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:46 ryankemper@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:45 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:38 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:34 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302934{{!}}Update definition of html heading to match Parsoid/core (T417530 T417531 T428677)]] (duration: 18m 41s)
* 21:32 robh@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:31 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:30 robh@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:29 cscott@deploy1003: arlolra, cscott: Continuing with deployment
* 21:26 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 21:25 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 21:24 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 21:24 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 21:23 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 21:21 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 21:20 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 21:20 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 21:17 cscott@deploy1003: arlolra, cscott: Backport for [[gerrit:1302934{{!}}Update definition of html heading to match Parsoid/core (T417530 T417531 T428677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1302934{{!}}Update definition of html heading to match Parsoid/core (T417530 T417531 T428677)]]
* 21:10 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:08 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 20:54 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2043.*
* 20:51 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302890{{!}}Guard round function with a supports query (T424596)]], [[gerrit:1302935{{!}}Add wprov parameter to home link (T429268)]] (duration: 09m 28s)
* 20:47 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:43 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1302890{{!}}Guard round function with a supports query (T424596)]], [[gerrit:1302935{{!}}Add wprov parameter to home link (T429268)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:41 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1302890{{!}}Guard round function with a supports query (T424596)]], [[gerrit:1302935{{!}}Add wprov parameter to home link (T429268)]]
* 20:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns5004.*
* 20:33 brett@dns1004: END - running authdns-update
* 20:31 brett@dns1004: START - running authdns-update
* 20:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns5004.wikimedia.org with OS bookworm
* 20:30 brett@dns5004: FAIL - running authdns-update
* 20:29 brett@dns5004: START - running authdns-update
* 20:28 brett@dns5004: FAIL - running authdns-update
* 20:27 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302320{{!}}EditChecks: Namespace tracking object for seen/shown/used checks]] (duration: 09m 50s)
* 20:26 brett@dns5004: START - running authdns-update
* 20:26 brett@dns5004: START - running authdns-update
* 20:25 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns5004.*,service=authdns-update
* 20:23 kemayo@deploy1003: kemayo: Continuing with deployment
* 20:19 kemayo@deploy1003: kemayo: Backport for [[gerrit:1302320{{!}}EditChecks: Namespace tracking object for seen/shown/used checks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:18 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 20:17 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1302320{{!}}EditChecks: Namespace tracking object for seen/shown/used checks]]
* 20:09 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 20:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1001.eqiad.wmnet with reason: host reimage
* 19:56 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1001.eqiad.wmnet with reason: host reimage
* 19:55 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 19:55 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:54 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:47 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:46 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 19:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1001.eqiad.wmnet with OS bookworm
* 19:39 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:35 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp - OpenSSL update ()
* 19:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:31 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 19:30 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp - OpenSSL update ()
* 19:27 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:18 topranks: restarting grpc server on eqiad SR-Linux switches to recover from problem of no free threads [[phab:T429242|T429242]]
* 19:08 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:08 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:00 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302274{{!}}Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188)]] (duration: 11m 18s)
* 18:58 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:56 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:56 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:55 krinkle@deploy1003: krinkle: Continuing with deployment
* 18:52 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:51 krinkle@deploy1003: krinkle: Backport for [[gerrit:1302274{{!}}Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:48 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1302274{{!}}Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188)]]
* 18:45 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns5004.wikimedia.org with reason: host reimage
* 18:41 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
* 18:41 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/data-gateway: apply
* 18:41 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns5004.wikimedia.org with reason: host reimage
* 18:40 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
* 18:39 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
* 18:39 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 18:39 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 18:35 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 18:34 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 18:33 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 18:30 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 18:23 jhuneidi@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.7 refs [[phab:T423916|T423916]]
* 18:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:12 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host dns5004
* 18:12 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dns5004
* 18:08 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dns5004
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dns5004.wikimedia.org 8.166.102.103.in-addr.arpa 8.0.0.0.6.6.1.0.2.0.1.0.3.0.1.0.1.0.0.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 18:08 brett@cumin2002: START - Cookbook sre.dns.wipe-cache dns5004.wikimedia.org 8.166.102.103.in-addr.arpa 8.0.0.0.6.6.1.0.2.0.1.0.3.0.1.0.1.0.0.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host dns5004 - brett@cumin2002"
* 18:08 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host dns5004 - brett@cumin2002"
* 18:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:00 brett@cumin2002: START - Cookbook sre.dns.netbox
* 18:00 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:59 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:53 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=dns5004.*
* 17:47 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:47 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change mgmt name for frproto1001 - cmooney@cumin1003"
* 17:46 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host dns5004
* 17:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host dns5004.wikimedia.org with OS bookworm
* 17:44 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change mgmt name for frproto1001 - cmooney@cumin1003"
* 17:43 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host conf2007.codfw.wmnet with OS trixie
* 17:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302912{{!}}Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it"]], [[gerrit:1302909{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]], [[gerrit:1302908{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]] (duration: 32m 19s)
* 17:38 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 17:30 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:29 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302912{{!}}Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it"]], [[gerrit:1302909{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]], [[gerrit:1302908{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:27 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2007.codfw.wmnet with OS trixie
* 17:25 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1007.eqiad.wmnet with OS trixie
* 17:20 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
* 17:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302912{{!}}Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it"]], [[gerrit:1302909{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]], [[gerrit:1302908{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]]
* 16:35 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:09 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab1004 - [[phab:T429350|T429350]] (duration: 00m 45s)
* 16:08 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab1004 - [[phab:T429350|T429350]]
* 16:08 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy
* 16:08 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab2002 - [[phab:T429350|T429350]] (duration: 00m 47s)
* 16:07 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab2002 - [[phab:T429350|T429350]]
* 16:06 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy
* 16:04 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 15:42 urbanecm@deploy1003: mwscript-k8s job started: GrowthExperiments:migrateMentorStatusAway --wiki=abwiki --dry-run # [[phab:T409170|T409170]]
* 15:39 moritzm: installing Tomcat security updates
* 15:38 urbanecm: Remove `migrateMentorStatusAwayToCommunityConfiguration` from `updatelog` on all wikis in `growthexperiments.dblist` ([[phab:T409170|T409170]])
* 15:38 dancy@deploy1003: Installation of scap version "4.269.0" completed for 2 hosts
* 15:36 dancy@deploy1003: Installing scap version "4.269.0" for 2 host(s)
* 15:33 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: test deploy phab2003 - [[phab:T427286|T427286]] (duration: 00m 49s)
* 15:33 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: test deploy phab2003 - [[phab:T427286|T427286]]
* 15:16 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 15:16 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 15:07 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-welcome # [[phab:T429352|T429352]]
* 15:06 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-discovery # [[phab:T429352|T429352]]
* 15:03 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-mentorship # [[phab:T429352|T429352]]
* 15:01 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302804{{!}}Hotfix for T428620 (T428620)]] (duration: 10m 00s)
* 14:57 awight@deploy1003: seanleong-wmde, awight: Continuing with deployment
* 14:55 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-help-panel # [[phab:T429352|T429352]]
* 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for frproto1001 (formerly payments1008) - cmooney@cumin1003"
* 14:54 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for frproto1001 (formerly payments1008) - cmooney@cumin1003"
* 14:53 awight@deploy1003: seanleong-wmde, awight: Backport for [[gerrit:1302804{{!}}Hotfix for T428620 (T428620)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:51 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1302804{{!}}Hotfix for T428620 (T428620)]]
* 14:48 aokoth@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab (duration: 02m 09s)
* 14:46 aokoth@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab
* 14:28 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 14:28 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 14:07 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302792{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187)]], [[gerrit:1302793{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T429187)]] (duration: 11m 29s)
* 14:07 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:03 dcausse@deploy1003: jgiannelos, dcausse: Continuing with deployment
* 14:02 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 14:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:58 dcausse@deploy1003: jgiannelos, dcausse: Backport for [[gerrit:1302792{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187)]], [[gerrit:1302793{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T429187)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:56 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1302792{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187)]], [[gerrit:1302793{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T429187)]]
* 13:54 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:52 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:52 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:52 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:51 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:48 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302850{{!}}Revert "translate: remove CirrusSearch endpoints"]] (duration: 04m 10s)
* 13:47 atsuko@deploy1003: atsuko: Rolling back deployment
* 13:47 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:46 atsuko@deploy1003: atsuko: Backport for [[gerrit:1302850{{!}}Revert "translate: remove CirrusSearch endpoints"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:44 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1302850{{!}}Revert "translate: remove CirrusSearch endpoints"]]
* 13:44 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:43 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:43 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:43 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:41 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:40 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 13:40 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 13:39 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:39 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302197{{!}}translate: remove CirrusSearch endpoints (T425377)]] (duration: 11m 16s)
* 13:37 atsuko@deploy1003: atsuko: Rolling back deployment
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1080.eqiad.wmnet with OS trixie
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:36 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:34 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 13:32 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1079.eqiad.wmnet with OS trixie
* 13:32 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:30 atsuko@deploy1003: atsuko: Backport for [[gerrit:1302197{{!}}translate: remove CirrusSearch endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1302197{{!}}translate: remove CirrusSearch endpoints (T425377)]]
* 13:25 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299626{{!}}Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206)]] (duration: 08m 50s)
* 13:25 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:21 dcausse@deploy1003: dcausse, neriah: Continuing with deployment
* 13:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:20 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1080.eqiad.wmnet with reason: host reimage
* 13:18 dcausse@deploy1003: dcausse, neriah: Backport for [[gerrit:1299626{{!}}Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:16 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1299626{{!}}Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206)]]
* 13:15 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:12 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1080.eqiad.wmnet with reason: host reimage
* 13:12 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298875{{!}}Remove custom streams (T423148)]] (duration: 08m 35s)
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1079.eqiad.wmnet with reason: host reimage
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1008.eqiad.wmnet with OS trixie
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:07 jmm@dns1004: END - running authdns-update
* 13:06 mfossati@deploy1003: ksarabia, mfossati: Continuing with deployment
* 13:05 mfossati@deploy1003: ksarabia, mfossati: Backport for [[gerrit:1298875{{!}}Remove custom streams (T423148)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 jmm@dns1004: START - running authdns-update
* 13:03 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1298875{{!}}Remove custom streams (T423148)]]
* 13:02 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1079.eqiad.wmnet with reason: host reimage
* 13:02 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:02 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1080.eqiad.wmnet with OS trixie
* 12:57 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:52 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1079.eqiad.wmnet with OS trixie
* 12:52 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 12:51 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1007.eqiad.wmnet with OS trixie
* 12:50 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 12:50 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver2002.codfw.wmnet
* 12:48 cmooney@cumin1003: START - Cookbook sre.mysql.pool pool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2255.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2255.codfw.wmnet
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2254.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2254.codfw.wmnet
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2243.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2243.codfw.wmnet
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2242.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2242.codfw.wmnet
* 12:47 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2092.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2092.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2091.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2091.codfw.wmnet
* 12:46 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 29 hosts
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2078.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2078.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2077.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2077.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2076.codfw.wmnet
* 12:46 cmooney@cumin1003: START - Cookbook sre.hosts.remove-downtime for 29 hosts
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2076.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2075.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2075.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2074.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2074.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2051.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2051.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2044.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2044.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2041.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2041.codfw.wmnet
* 12:46 cmooney@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2001.codfw.wmnet
* 12:46 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:45 cmooney@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2001.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2018.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2018.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2017.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2017.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2014.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2014.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2013.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2013.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2012.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2012.codfw.wmnet
* 12:44 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1008.eqiad.wmnet with reason: host reimage
* 12:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver2002.codfw.wmnet
* 12:40 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1008.eqiad.wmnet with reason: host reimage
* 12:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1006.eqiad.wmnet with reason: host reimage
* 12:28 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:24 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1008.eqiad.wmnet with OS trixie
* 12:24 topranks: reboot lsw1-a5-codfw to complete JunOS upgrade [[phab:T428020|T428020]]
* 12:23 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
* 12:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1006.eqiad.wmnet with reason: host reimage
* 12:19 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2255.codfw.wmnet
* 12:19 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2255.codfw.wmnet
* 12:19 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2254.codfw.wmnet
* 12:18 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2254.codfw.wmnet
* 12:17 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2243.codfw.wmnet
* 12:17 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2243.codfw.wmnet
* 12:17 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2242.codfw.wmnet
* 12:16 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2242.codfw.wmnet
* 12:16 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2092.codfw.wmnet
* 12:16 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2092.codfw.wmnet
* 12:16 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2091.codfw.wmnet
* 12:15 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2091.codfw.wmnet
* 12:15 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2078.codfw.wmnet
* 12:14 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2078.codfw.wmnet
* 12:14 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2077.codfw.wmnet
* 12:14 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2077.codfw.wmnet
* 12:14 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2076.codfw.wmnet
* 12:13 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2076.codfw.wmnet
* 12:13 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2075.codfw.wmnet
* 12:12 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2075.codfw.wmnet
* 12:12 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2074.codfw.wmnet
* 12:12 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2074.codfw.wmnet
* 12:12 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2051.codfw.wmnet
* 12:10 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 29 hosts with reason: lsw1-a5-codfw JunOS upgrade
* 12:07 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2051.codfw.wmnet
* 12:06 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lsw1-a5-codfw,lsw1-a5-codfw IPv6,lsw1-a5-codfw.mgmt,ssw1-a[1,8]-codfw.mgmt with reason: switch upgrade
* 12:06 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2044.codfw.wmnet
* 12:06 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2044.codfw.wmnet
* 12:06 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2041.codfw.wmnet
* 12:05 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2041.codfw.wmnet
* 12:05 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2018.codfw.wmnet
* 12:05 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2018.codfw.wmnet
* 12:04 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2017.codfw.wmnet
* 12:04 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2017.codfw.wmnet
* 12:04 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2014.codfw.wmnet
* 12:03 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2014.codfw.wmnet
* 12:03 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2013.codfw.wmnet
* 12:03 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2013.codfw.wmnet
* 12:02 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2012.codfw.wmnet
* 12:02 cmooney@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2001.codfw.wmnet
* 12:01 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2012.codfw.wmnet
* 12:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 11:57 cmooney@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2001.codfw.wmnet
* 11:51 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302794{{!}}Revert "hCaptcha: Enable for UploadWizard on all wikis with it"]] (duration: 08m 45s)
* 11:49 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:49 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:49 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 11:48 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:47 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:47 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 11:46 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:46 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:46 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 11:46 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:45 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302794{{!}}Revert "hCaptcha: Enable for UploadWizard on all wikis with it"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:43 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 11:43 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302794{{!}}Revert "hCaptcha: Enable for UploadWizard on all wikis with it"]]
* 11:42 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 11:41 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2035: Migration of es2035.codfw.wmnet completed
* 11:06 moritzm: installing Bird security updates on routed Ganeti nodes
* 10:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es1037 [[phab:T429118|T429118]]', diff saved to https://phabricator.wikimedia.org/P94172 and previous config saved to /var/cache/conftool/dbconfig/20260616-104931-marostegui.json
* 10:25 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 10:24 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 10:24 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2035: Migration of es2035.codfw.wmnet completed
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for an-redacteddb1001.eqiad.wmnet
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for an-redacteddb1001.eqiad.wmnet
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 11 hosts
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for 11 hosts
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1155.eqiad.wmnet
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1155.eqiad.wmnet
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1154.eqiad.wmnet
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1154.eqiad.wmnet
* 10:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Migration of es1036.eqiad.wmnet completed
* 10:22 jmm@dns1004: END - running authdns-update
* 10:22 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:20 jmm@dns1004: START - running authdns-update
* 10:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:19 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 10:18 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 10:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:18 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 10:18 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 10:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2035.codfw.wmnet with OS trixie
* 09:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2035.codfw.wmnet with reason: host reimage
* 09:52 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2035.codfw.wmnet with reason: host reimage
* 09:49 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 09:48 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 09:47 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302762{{!}}hCaptcha: Enable for UploadWizard on all wikis with it (T426126)]] (duration: 09m 38s)
* 09:43 marostegui: Drop wrongly created tables on testwikidatawiki s3 master [[phab:T429304|T429304]]
* 09:42 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 09:39 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302762{{!}}hCaptcha: Enable for UploadWizard on all wikis with it (T426126)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:38 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --wiki=wikidatawiki --registeredWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # [[phab:T418115|T418115]]
* 09:37 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302762{{!}}hCaptcha: Enable for UploadWizard on all wikis with it (T426126)]]
* 09:37 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --wiki=wikidatawiki --registeredWithin=1year --editedWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # [[phab:T418115|T418115]]
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Migration of es1036.eqiad.wmnet completed
* 09:37 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --registeredWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # [[phab:T418115|T418115]]
* 09:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2035.codfw.wmnet with OS trixie
* 09:34 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2035: Upgrading es2035.codfw.wmnet
* 09:34 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2035: Upgrading es2035.codfw.wmnet
* 09:34 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:32 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2035 [[phab:T429303|T429303]]', diff saved to https://phabricator.wikimedia.org/P94164 and previous config saved to /var/cache/conftool/dbconfig/20260616-093247-marostegui.json
* 09:31 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2037 to es6 primary [[phab:T429303|T429303]]', diff saved to https://phabricator.wikimedia.org/P94163 and previous config saved to /var/cache/conftool/dbconfig/20260616-093149-marostegui.json
* 09:31 jayme: imported istioctl 1.29.4-1 to bookworm-/trixie-wikimedia - [[phab:T427401|T427401]]
* 09:30 marostegui: Starting es6 codfw failover from es2035 to es2037 - [[phab:T429303|T429303]]
* 09:30 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 09:30 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 09:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es2037 with weight 0 [[phab:T429303|T429303]]', diff saved to https://phabricator.wikimedia.org/P94162 and previous config saved to /var/cache/conftool/dbconfig/20260616-092937-marostegui.json
* 09:29 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 8 hosts with reason: Primary switchover es6 [[phab:T429303|T429303]]
* 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1036.eqiad.wmnet with OS trixie
* 09:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:19 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297161{{!}}[Growth] wikidatawiki: Enable Growth features (T418115)]] (duration: 16m 29s)
* 09:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:14 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 09:13 urbanecm: php multiversion/MWScript.php WikimediaMaintenance:createExtensionTables.php --wiki=<nowiki>{</nowiki>testwikidatawiki,wikidatawiki<nowiki>}</nowiki> growthexperiments # [[phab:T418115|T418115]], within mw-debug
* 09:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage
* 09:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 09:07 tappof@cumin1003: END (PASS) - Cookbook sre.metamonitoring.downtime (exit_code=0) Downtime for 0:05:00 of prometheus/deadmanswitchnotified, prometheus/deadmanswitchonamdb, prometheus/extmon on 2 host(s) with reason: cookbook test
* 09:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 09:07 tappof@cumin1003: START - Cookbook sre.metamonitoring.downtime Downtime for 0:05:00 of prometheus/deadmanswitchnotified, prometheus/deadmanswitchonamdb, prometheus/extmon on 2 host(s) with reason: cookbook test
* 09:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 09:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 09:04 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1297161{{!}}[Growth] wikidatawiki: Enable Growth features (T418115)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage
* 09:02 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1297161{{!}}[Growth] wikidatawiki: Enable Growth features (T418115)]]
* 09:01 moritzm: uploaded bird 2.18.2-1~wmf13u1 to trixie-wikimedia [[phab:T429285|T429285]]
* 09:00 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist wikidata WikimediaMaintenance:createExtensionTables.php GrowthExperiments # [[phab:T418115|T418115]]
* 08:56 moritzm: uploaded bird 2.18.2-1~wmf12u1 to bookworm-wikimedia [[phab:T429285|T429285]]
* 08:48 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1036.eqiad.wmnet with OS trixie
* 08:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1036: Upgrading es1036.eqiad.wmnet
* 08:46 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302735{{!}}hCaptcha: Enable for MobileFrontend in all wikis (T425940)]] (duration: 19m 23s)
* 08:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1036: Upgrading es1036.eqiad.wmnet
* 08:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1047: repool after upgrade
* 08:42 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:32 moritzm: installing nginx security updates
* 08:29 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302735{{!}}hCaptcha: Enable for MobileFrontend in all wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:27 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302735{{!}}hCaptcha: Enable for MobileFrontend in all wikis (T425940)]]
* 08:23 mszwarc@deploy1003: Synchronized private/PrivateSettings.php: Private code deployment for Suggested Investigations (duration: 02m 23s)
* 08:19 mszwarc@deploy1003: Synchronized private/SuggestedInvestigationsSignals: Private code deployment for Suggested Investigations (duration: 06m 03s)
* 08:17 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94157)
* 08:05 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302629{{!}}Improve click intent event logging and exposure tracking]] (duration: 11m 31s)
* 08:00 moritzm: update bird on ganeti7001 to 2.18.2-1~wmf12u1
* 07:58 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
* 07:58 wmde-fisch@deploy1003: wmde-fisch: Backport for [[gerrit:1302629{{!}}Improve click intent event logging and exposure tracking]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1047: repool after upgrade
* 07:54 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1302629{{!}}Improve click intent event logging and exposure tracking]]
* 07:50 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302170{{!}}Update VE core submodule to master (3e79e9934) (T397319 T428764)]] (duration: 36m 13s)
* 07:36 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
* 07:33 wmde-fisch@deploy1003: wmde-fisch: Backport for [[gerrit:1302170{{!}}Update VE core submodule to master (3e79e9934) (T397319 T428764)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:14 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1302170{{!}}Update VE core submodule to master (3e79e9934) (T397319 T428764)]]
* 07:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1047.eqiad.wmnet with OS trixie
* 06:50 hashar@deploy1003: Finished deploy [integration/docroot@2165507]: build: Updating js-yaml to 4.2.0 (duration: 00m 16s)
* 06:50 hashar@deploy1003: Started deploy [integration/docroot@2165507]: build: Updating js-yaml to 4.2.0
* 06:44 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1047.eqiad.wmnet with reason: host reimage
* 06:40 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1047.eqiad.wmnet with reason: host reimage
* 06:25 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1047.eqiad.wmnet with OS trixie
* 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:24 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1047: Upgrading es1047.eqiad.wmnet
* 05:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1047: Upgrading es1047.eqiad.wmnet
* 05:58 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 04:55 ryankemper: [[phab:T427951|T427951]] Deleted 4 leftover mirrored dev/test topics from kafka-test: <code>eqiad.mediawiki.<nowiki>{</nowiki>page_html_content_change.dev<nowiki>{</nowiki>1,4<nowiki>}</nowiki>,page_edit_type_simple.dev0<nowiki>}</nowiki></code>, <code>eqiad.mw_page_edit_type_enrich.error</code>
* 04:05 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.4 (duration: 05m 29s)
== 2026-06-15 ==
* 22:35 sbassett: Deployed private config for [[phab:T429244|T429244]]
* 22:05 sbassett: Deployed updated security fix for [[phab:T427611|T427611]]
* 22:04 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 22:04 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 22:04 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 22:03 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 21:54 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302277{{!}}beta: Point remaining db11 references at deployment-db15 (T428930)]] (duration: 12m 27s)
* 21:53 dancy@deploy1003: dancy: Continuing with deployment
* 21:49 dancy@deploy1003: dancy: Backport for [[gerrit:1302277{{!}}beta: Point remaining db11 references at deployment-db15 (T428930)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:48 sbassett: Deployed security fix for [[phab:T428809|T428809]]
* 21:48 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1302277{{!}}beta: Point remaining db11 references at deployment-db15 (T428930)]]
* 21:40 sbassett: Deployed security fix for [[phab:T428820|T428820]]
* 21:22 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302267{{!}}ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks]] (duration: 08m 11s)
* 21:17 sbassett@deploy1003: sbassett: Continuing with deployment
* 21:15 sbassett@deploy1003: sbassett: Backport for [[gerrit:1302267{{!}}ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:13 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1302267{{!}}ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks]]
* 21:06 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5028.*
* 21:06 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 21:05 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 20:52 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5028.eqsin.wmnet with OS trixie
* 20:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 20:21 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300245{{!}}REST: set new RestModuleOverrides variable (T422756)]], [[gerrit:1302232{{!}}Enable "exit the editor" survey on 11 wikis for phase 2 (T426132)]] (duration: 10m 54s)
* 20:17 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 20:16 dancy@deploy1003: caro, dancy, bpirkle: Continuing with deployment
* 20:14 dancy@deploy1003: caro, dancy, bpirkle: Backport for [[gerrit:1300245{{!}}REST: set new RestModuleOverrides variable (T422756)]], [[gerrit:1302232{{!}}Enable "exit the editor" survey on 11 wikis for phase 2 (T426132)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1300245{{!}}REST: set new RestModuleOverrides variable (T422756)]], [[gerrit:1302232{{!}}Enable "exit the editor" survey on 11 wikis for phase 2 (T426132)]]
* 20:02 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS trixie
* 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS trixie
* 19:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5028
* 19:44 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5028
* 19:43 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5028
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5028.eqsin.wmnet 25.0.132.10.in-addr.arpa 5.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:43 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5028.eqsin.wmnet 25.0.132.10.in-addr.arpa 5.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5028 - brett@cumin2002"
* 19:42 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5028 - brett@cumin2002"
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:36 brett@cumin2002: START - Cookbook sre.dns.netbox
* 19:35 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3067.esams.wmnet
* 19:34 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3067.esams.wmnet
* 19:33 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 19:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3066.esams.wmnet
* 19:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:33 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3066.esams.wmnet
* 19:26 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5028
* 19:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 19:23 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart A:liberica-eqsin
* 19:21 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart A:liberica-eqsin
* 19:18 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 19:17 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:16 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:15 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:14 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:06 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
* 19:05 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 19:05 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:04 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5026.eqsin.wmnet with OS trixie
* 18:44 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-purged (exit_code=0) rolling restart_daemons on P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:42 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-purged rolling restart_daemons on P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 18:27 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:27 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:27 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 18:18 mutante: releases2003 - systemctl stop tmp.mount
* 17:53 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5026
* 17:53 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5026
* 17:52 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5026
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5026.eqsin.wmnet 37.0.132.10.in-addr.arpa 7.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 17:52 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5026.eqsin.wmnet 37.0.132.10.in-addr.arpa 7.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5026 - brett@cumin2002"
* 17:52 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5026 - brett@cumin2002"
* 17:46 brett@cumin2002: START - Cookbook sre.dns.netbox
* 17:40 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-d8-eqiad
* 17:40 cmooney@cumin1003: START - Cookbook sre.network.tls for network device ssw1-d8-eqiad
* 17:36 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-c4-eqiad
* 17:35 cmooney@cumin1003: START - Cookbook sre.network.tls for network device lsw1-c4-eqiad
* 17:34 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-c4-eqiad
* 17:34 cmooney@cumin1003: START - Cookbook sre.network.tls for network device lsw1-c4-eqiad
* 17:09 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5026
* 17:07 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5026.eqsin.wmnet with OS trixie
* 17:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:36 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 16:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 16:16 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
* 16:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:16 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/toolhub: apply
* {{safesubst:SAL entry|1=16:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302192{{!}}SourceEditorOverlayHookPayload: Allow aborting of the save (T428287)]], [[gerrit:1302194{{!}}hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287)]], [[gerrit:1302195{{!}}OATHUserRepository: Specify caller in query]], [[gerrit:1302186{{!}}Bump guzzlehttp/psr to version 2.11.0 (T429208)]], [[gerrit:1302169{{!}}NoReferrerLinks: Add re}}
* 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:08 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
* 16:08 dreamyjazz@deploy1003: reedy, dreamyjazz, kharlan: Continuing with deployment
* 16:08 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/toolhub: apply
* {{safesubst:SAL entry|1=16:07 dreamyjazz@deploy1003: reedy, dreamyjazz, kharlan: Backport for [[gerrit:1302192{{!}}SourceEditorOverlayHookPayload: Allow aborting of the save (T428287)]], [[gerrit:1302194{{!}}hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287)]], [[gerrit:1302195{{!}}OATHUserRepository: Specify caller in query]], [[gerrit:1302186{{!}}Bump guzzlehttp/psr to version 2.11.0 (T429208)]], [[gerrit:1302169{{!}}NoReferrerLinks: Add}}
* {{safesubst:SAL entry|1=16:05 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302192{{!}}SourceEditorOverlayHookPayload: Allow aborting of the save (T428287)]], [[gerrit:1302194{{!}}hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287)]], [[gerrit:1302195{{!}}OATHUserRepository: Specify caller in query]], [[gerrit:1302186{{!}}Bump guzzlehttp/psr to version 2.11.0 (T429208)]], [[gerrit:1302169{{!}}NoReferrerLinks: Add rel}}
* 16:04 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:04 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:51 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:51 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: puppet debugging
* 15:50 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: puppet debugging
* 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1196: Migration of db1196.eqiad.wmnet completed
* 15:41 mutante: added new project language 'nyn' - Bantu language spoken by the Nkore and Hema peoples of Southwestern Uganda
* 15:40 dzahn@dns1006: END - running authdns-update
* 15:36 dzahn@dns1006: START - running authdns-update
* 15:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1155.eqiad.wmnet
* 15:19 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1155.eqiad.wmnet
* 15:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1154.eqiad.wmnet
* 15:18 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1154.eqiad.wmnet
* 15:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 11 hosts
* 15:18 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for 11 hosts
* 15:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for an-redacteddb1001.eqiad.wmnet
* 15:17 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for an-redacteddb1001.eqiad.wmnet
* 15:16 topranks: repool esams following cr2-esams rpd crash
* 15:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool esams [reason: no reason specified, no task ID specified]
* 15:13 cmooney@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool esams [reason: no reason specified, no task ID specified]
* 15:02 topranks: depool esams due to cr2-esams rpd crash
* 15:02 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool esams [reason: no reason specified, no task ID specified]
* 15:01 cmooney@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool esams [reason: no reason specified, no task ID specified]
* 15:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 14:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 14:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1196: Migration of db1196.eqiad.wmnet completed
* 14:54 topranks: enable BGP graceful-shutdown sender on cr2-esams to drain traffic [[phab:T427056|T427056]]
* 14:52 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr2-esams,cr2-esams IPv6 with reason: bouncing pic0 to reconfigure port speeds
* 14:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1196.eqiad.wmnet with OS trixie
* 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1077.eqiad.wmnet with OS trixie
* 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 14:24 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2001.codfw.wmnet with reason: tesT
* 14:24 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 14:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
* 14:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
* 14:08 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 14:07 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 14:07 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudvirt1077.eqiad.wmnet with reason: host reimage
* 14:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1077.eqiad.wmnet with reason: host reimage
* 14:06 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 14:05 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 14:05 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 14:04 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 14:03 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1196.eqiad.wmnet with OS trixie
* 14:02 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 14:02 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 14:01 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 14:01 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 14:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1196: Upgrading db1196.eqiad.wmnet
* 14:00 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1196: Upgrading db1196.eqiad.wmnet
* 14:00 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:56 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1077.eqiad.wmnet with OS trixie
* 13:56 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 13:54 federico3: doing a quick restart of sanitarium hosts db1155 and db1154
* 13:53 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94145)
* 13:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1154.eqiad.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1155.eqiad.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 11 hosts with reason: Reboots [[phab:T426633|T426633]]
* 13:49 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet with reason: Reboots [[phab:T426633|T426633]]
* {{safesubst:SAL entry|1=13:43 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300835{{!}}Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742)]], [[gerrit:1302153{{!}}TaskSuggester: avoid nullable logger in setLogger call]], [[gerrit:1302100{{!}}migrateMentorStatusAway: ensure validateStrictly receives objects (T409170)]], [[gerrit:1301451{{!}}Store nowiki source in StripState::extra to support subst-nowiki (T}}
* 13:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 13:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 13:39 jforrester@deploy1003: arlolra, sgimeno, jforrester: Continuing with deployment
* {{safesubst:SAL entry|1=13:37 jforrester@deploy1003: arlolra, sgimeno, jforrester: Backport for [[gerrit:1300835{{!}}Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742)]], [[gerrit:1302153{{!}}TaskSuggester: avoid nullable logger in setLogger call]], [[gerrit:1302100{{!}}migrateMentorStatusAway: ensure validateStrictly receives objects (T409170)]], [[gerrit:1301451{{!}}Store nowiki source in StripState::extra to support subst-nowik}}
* {{safesubst:SAL entry|1=13:35 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1300835{{!}}Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742)]], [[gerrit:1302153{{!}}TaskSuggester: avoid nullable logger in setLogger call]], [[gerrit:1302100{{!}}migrateMentorStatusAway: ensure validateStrictly receives objects (T409170)]], [[gerrit:1301451{{!}}Store nowiki source in StripState::extra to support subst-nowiki (T3}}
* 13:34 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2216: Migration of db2216.codfw.wmnet completed
* 13:29 topranks: enable BGP graceful-shutdown sender on cr2-esams to drain traffic [[phab:T427056|T427056]]
* 13:28 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr2-esams,cr2-esams IPv6 with reason: bouncing pic0 to reconfigure port speeds
* 13:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:26 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 13:25 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 13:25 topranks: cr2-esams, reconfigure chassis fpc to set port 0 to 100G [[phab:T427056|T427056]]
* 13:25 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 13:24 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1251: Migration of db1251.eqiad.wmnet completed
* {{safesubst:SAL entry|1=13:22 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293173{{!}}Configure wgOAuthAutoApprove['protocols'] (T412542 T426614)]], [[gerrit:1300873{{!}}jawiki: remove four rights from the eliminator group (T428942)]], [[gerrit:1301401{{!}}Deploy PRV to 6 wikis (T429038)]], [[gerrit:1300858{{!}}[abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730)]], [[gerrit:1300872{{!}}abstractwiki: Temporary config f}}
* 13:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:18 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:17 jforrester@deploy1003: arlolra, matmarex, jforrester, dragoniez: Continuing with deployment
* {{safesubst:SAL entry|1=13:13 jforrester@deploy1003: arlolra, matmarex, jforrester, dragoniez: Backport for [[gerrit:1293173{{!}}Configure wgOAuthAutoApprove['protocols'] (T412542 T426614)]], [[gerrit:1300873{{!}}jawiki: remove four rights from the eliminator group (T428942)]], [[gerrit:1301401{{!}}Deploy PRV to 6 wikis (T429038)]], [[gerrit:1300858{{!}}[abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730)]], [[gerrit:1300872{{!}}abstractwiki: Te}}
* 13:13 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:12 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* {{safesubst:SAL entry|1=13:12 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1293173{{!}}Configure wgOAuthAutoApprove['protocols'] (T412542 T426614)]], [[gerrit:1300873{{!}}jawiki: remove four rights from the eliminator group (T428942)]], [[gerrit:1301401{{!}}Deploy PRV to 6 wikis (T429038)]], [[gerrit:1300858{{!}}[abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730)]], [[gerrit:1300872{{!}}abstractwiki: Temporary config fo}}
* 13:10 moritzm: installing Linux 6.1.174 on Bookworm hosts
* 13:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:05 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:48 moritzm: installing augeas security updates
* 12:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2216: Migration of db2216.codfw.wmnet completed
* 12:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:43 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2036: Migration of es2036.codfw.wmnet completed
* 12:38 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302124{{!}}Extract a service that initiates SI signal matching (T428557)]], [[gerrit:1302125{{!}}Trigger Suggested Investigations when client hints are saved (T428557)]] (duration: 07m 42s)
* 12:37 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1251: Migration of db1251.eqiad.wmnet completed
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2216.codfw.wmnet with OS trixie
* 12:34 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:34 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 12:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:32 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1302124{{!}}Extract a service that initiates SI signal matching (T428557)]], [[gerrit:1302125{{!}}Trigger Suggested Investigations when client hints are saved (T428557)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:31 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1302124{{!}}Extract a service that initiates SI signal matching (T428557)]], [[gerrit:1302125{{!}}Trigger Suggested Investigations when client hints are saved (T428557)]]
* 12:27 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1251.eqiad.wmnet with OS trixie
* 12:23 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 12:21 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 12:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2216.codfw.wmnet with reason: host reimage
* 12:15 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2216.codfw.wmnet with reason: host reimage
* 12:10 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1251.eqiad.wmnet with reason: host reimage
* 12:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 12:06 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:05 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1251.eqiad.wmnet with reason: host reimage
* 11:56 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 11:55 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 11:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2036: Migration of es2036.codfw.wmnet completed
* 11:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:53 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2216.codfw.wmnet with OS trixie
* 11:50 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2216: Upgrading db2216.codfw.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2216: Upgrading db2216.codfw.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:48 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1251.eqiad.wmnet with OS trixie
* 11:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1251: Upgrading db1251.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1251: Upgrading db1251.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:44 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94128)
* 11:43 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:43 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on eqiad-k8s (dblist: https://phabricator.wikimedia.org/P94127)
* 11:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2036.codfw.wmnet with OS trixie
* 11:37 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2036.codfw.wmnet with reason: host reimage
* 11:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2036.codfw.wmnet with reason: host reimage
* 11:09 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-eqiad
* 11:08 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-eqiad
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2036.codfw.wmnet with OS trixie
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2036: Upgrading es2036.codfw.wmnet
* 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2036: Upgrading es2036.codfw.wmnet
* 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:55 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-codfw
* 10:54 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-codfw
* 10:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2037: repool after upgrade
* 10:52 moritzm: installing openssl security updates on bookworm
* 10:30 cgoubert@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301341{{!}}Close API Portal wiki (T427537)]] (duration: 07m 16s)
* 10:26 cgoubert@deploy1003: cgoubert: Continuing with deployment
* 10:25 cgoubert@deploy1003: cgoubert: Backport for [[gerrit:1301341{{!}}Close API Portal wiki (T427537)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:23 cgoubert@deploy1003: Started scap sync-world: Backport for [[gerrit:1301341{{!}}Close API Portal wiki (T427537)]]
* 10:16 blake@deploy1003: Finished scap sync-world: apache config change ([[phab:T428772|T428772]]) (duration: 06m 41s)
* 10:12 blake@deploy1003: blake: Continuing with deployment
* 10:11 blake@deploy1003: blake: apache config change ([[phab:T428772|T428772]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:10 blake@deploy1003: Started scap sync-world: apache config change ([[phab:T428772|T428772]])
* 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2037: repool after upgrade
* 10:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2037.codfw.wmnet with OS trixie
* 09:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 09:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 09:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 09:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:43 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 09:42 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 09:40 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on eqiad-k8s (dblist: https://phabricator.wikimedia.org/P94120)
* 09:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2037.codfw.wmnet with reason: host reimage
* 09:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2037.codfw.wmnet with reason: host reimage
* 09:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:15 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:14 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2037.codfw.wmnet with OS trixie
* 09:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:12 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:59 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:56 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2037.codfw.wmnet with OS trixie
* 08:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2037.codfw.wmnet with OS trixie
* 08:53 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2037: Upgrading es2037.codfw.wmnet
* 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2037: Upgrading es2037.codfw.wmnet
* 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:45 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:45 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:44 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:43 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:41 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:40 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:35 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main'.
* 08:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P94117 and previous config saved to /var/cache/conftool/dbconfig/20260615-081440-fceratto.json
* 08:10 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301373{{!}}translate: production opensearch on k8s endpoints (T425377)]] (duration: 20m 54s)
* 08:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2047: Migration of es2047.codfw.wmnet completed
* 08:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P94115 and previous config saved to /var/cache/conftool/dbconfig/20260615-080432-fceratto.json
* 08:03 atsuko@deploy1003: atsuko: Continuing with deployment
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P94114 and previous config saved to /var/cache/conftool/dbconfig/20260615-075425-fceratto.json
* 07:53 atsuko@deploy1003: atsuko: Backport for [[gerrit:1301373{{!}}translate: production opensearch on k8s endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:49 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1301373{{!}}translate: production opensearch on k8s endpoints (T425377)]]
* 07:47 dcausse@deploy1003: mwscript-k8s job started: namespaceDupes cswiki --fix # [[phab:T428619|T428619]]
* 07:46 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301675{{!}}Switch wmgUseCalendar to false for dewikivoyage (T429095)]], [[gerrit:1300301{{!}}Add alias namespace for cswiki (T428619)]] (duration: 34m 37s)
* 07:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P94112 and previous config saved to /var/cache/conftool/dbconfig/20260615-074417-fceratto.json
* 07:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:33 dcausse@deploy1003: vadymts1, dcausse: Continuing with deployment
* 07:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:31 cwilliams@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on db-test2001.codfw.wmnet with reason: Testing
* 07:28 dcausse@deploy1003: vadymts1, dcausse: Backport for [[gerrit:1301675{{!}}Switch wmgUseCalendar to false for dewikivoyage (T429095)]], [[gerrit:1300301{{!}}Add alias namespace for cswiki (T428619)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:26 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:26 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:25 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:24 arnaudb@dns1005: END - running authdns-update
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1163 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P94110 and previous config saved to /var/cache/conftool/dbconfig/20260615-072446-fceratto.json
* 07:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 07:24 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:23 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:23 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2047: Migration of es2047.codfw.wmnet completed
* 07:23 arnaudb@dns1005: START - running authdns-update
* 07:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:21 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:11 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2047.codfw.wmnet with OS trixie
* 07:11 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1301675{{!}}Switch wmgUseCalendar to false for dewikivoyage (T429095)]], [[gerrit:1300301{{!}}Add alias namespace for cswiki (T428619)]]
* 07:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 06:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2047.codfw.wmnet with reason: host reimage
* 06:53 moritzm: imported zookeeper 3.4.13-6+wmf12u1 to component/zookeeper34 for bookworm-wikimedia [[phab:T428495|T428495]]
* 06:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2047.codfw.wmnet with reason: host reimage
* 06:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2047.codfw.wmnet with OS trixie
* 06:28 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2047: Upgrading es2047.codfw.wmnet
* 06:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2047: Upgrading es2047.codfw.wmnet
* 06:27 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 06:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:59 marostegui: install mariadb 10.11.18 on pc1 [[phab:T428861|T428861]]
* 05:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2021.codfw.wmnet,pc1021.eqiad.wmnet with reason: upgrading
* 05:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:56 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:56 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:49 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:49 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2046', diff saved to https://phabricator.wikimedia.org/P94105 and previous config saved to /var/cache/conftool/dbconfig/20260615-053403-marostegui.json
* 05:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on es2046.codfw.wmnet with reason: cloning
* 05:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on es2045.codfw.wmnet with reason: crash
* 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2046', diff saved to https://phabricator.wikimedia.org/P94104 and previous config saved to /var/cache/conftool/dbconfig/20260615-053041-marostegui.json
* 02:18 Amir1: making Dexbot a bot in cywiki ([[phab:T428927|T428927]])
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 58s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-14 ==
* 11:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 11:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 11:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 11:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 34s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-13 ==
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-12 ==
* 19:54 dwisehaupt@dns1004: END - running authdns-update
* 19:52 dwisehaupt@dns1004: START - running authdns-update
* 18:33 dwisehaupt@dns1006: END - running authdns-update
* 18:32 dwisehaupt@dns1006: START - running authdns-update
* 16:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 15:59 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 15:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 15:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:43 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301371{{!}}Hotfix for T428620 (T428620)]] (duration: 11m 17s)
* 14:36 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 14:35 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1301371{{!}}Hotfix for T428620 (T428620)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:31 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1301371{{!}}Hotfix for T428620 (T428620)]]
* 14:29 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:28 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:22 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:22 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:22 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:22 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 12:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 12:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 12:04 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:04 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:04 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 12:02 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 12:01 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 11:40 moritzm: installing Linux 5.10.257 on Bullseye hosts
* 11:36 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:35 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:35 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:24 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 11:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:56 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:56 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:49 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:49 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:40 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:37 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:36 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:35 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:12 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:12 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:08 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 09:59 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 09:58 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 09:57 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.disable-merges (exit_code=0)
* 06:11 jmm@cumin2002: START - Cookbook sre.puppet.disable-merges
* 03:07 ryankemper: [[phab:T427951|T427951]] sorry, `[eqiad,codfw].mediawiki.page_html_content_change.rc0` (accidentally a word)
* 03:06 ryankemper: [[phab:T427951|T427951]] Deleted all 20 unused dev/test topics on kafka-jumbo (verified empty first); 2 (`[eqiad,codfw]page_html_content_change.rc0`) were immediately auto-recreated empty by a still-running `dse-k8s` enrichment consumer; awaiting owner confirmation before final re-delete
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 01m 13s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:00 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
== 2026-06-11 ==
* 22:27 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 22:26 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 22:14 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 22:13 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 22:05 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300906{{!}}Restore MediaViewer toggle in Special:Preferences (T428742)]] (duration: 30m 51s)
* 21:58 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host releases2003.codfw.wmnet with OS trixie
* 21:52 egardner@deploy1003: egardner: Continuing with deployment
* 21:51 egardner@deploy1003: egardner: Backport for [[gerrit:1300906{{!}}Restore MediaViewer toggle in Special:Preferences (T428742)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:34 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1300906{{!}}Restore MediaViewer toggle in Special:Preferences (T428742)]]
* 21:34 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: host reimage
* 21:29 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300913{{!}}Avoid the escaping from nowiki processing (T398967)]] (duration: 09m 09s)
* 21:28 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on releases2003.codfw.wmnet with reason: host reimage
* 21:25 arlolra@deploy1003: arlolra: Continuing with deployment
* 21:22 arlolra@deploy1003: arlolra: Backport for [[gerrit:1300913{{!}}Avoid the escaping from nowiki processing (T398967)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:20 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1300913{{!}}Avoid the escaping from nowiki processing (T398967)]]
* 21:07 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300911{{!}}hCaptcha: Enable for badlogin for all small wikis (T426875)]], [[gerrit:1300905{{!}}RadioRangeBallot: Fix strict mode issue (T428947)]] (duration: 10m 43s)
* 21:06 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on A:cp-text and not P<nowiki>{</nowiki>cp7008*<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 21:01 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 21:00 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300911{{!}}hCaptcha: Enable for badlogin for all small wikis (T426875)]], [[gerrit:1300905{{!}}RadioRangeBallot: Fix strict mode issue (T428947)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:56 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300911{{!}}hCaptcha: Enable for badlogin for all small wikis (T426875)]], [[gerrit:1300905{{!}}RadioRangeBallot: Fix strict mode issue (T428947)]]
* 20:51 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300842{{!}}Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313)]], [[gerrit:1300843{{!}}[A11y] Donor Badge: Remove Badge button disappears too quickly (T428646)]], [[gerrit:1300896{{!}}Donor Delight Badge, styles: Amending to final design review feedback (T427313)]] (duration: 34m 10s)
* 20:39 jdrewniak@deploy1003: annet, jdrewniak: Continuing with deployment
* 20:35 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host releases2003.codfw.wmnet with OS trixie
* 20:34 jdrewniak@deploy1003: annet, jdrewniak: Backport for [[gerrit:1300842{{!}}Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313)]], [[gerrit:1300843{{!}}[A11y] Donor Badge: Remove Badge button disappears too quickly (T428646)]], [[gerrit:1300896{{!}}Donor Delight Badge, styles: Amending to final design review feedback (T427313)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1300842{{!}}Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313)]], [[gerrit:1300843{{!}}[A11y] Donor Badge: Remove Badge button disappears too quickly (T428646)]], [[gerrit:1300896{{!}}Donor Delight Badge, styles: Amending to final design review feedback (T427313)]]
* 19:12 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 18:12 ozge@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 18:12 ozge@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 17:52 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300865{{!}}UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146)]] (duration: 08m 15s)
* 17:48 reedy@deploy1003: reedy: Continuing with deployment
* 17:46 reedy@deploy1003: reedy: Backport for [[gerrit:1300865{{!}}UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:44 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1300865{{!}}UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146)]]
* 17:26 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:25 blake@deploy1003: Scap cancelled without rolling back.
* 17:25 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 17:24 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 17:24 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:24 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 17:24 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:23 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 17:23 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 17:23 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:23 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 17:23 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:23 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 17:20 blake@deploy1003: blake: apache config update ([[phab:T428772|T428772]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:20 blake@deploy1003: Started scap sync-world: apache config update ([[phab:T428772|T428772]])
* 17:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 17:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Migration of db2212.codfw.wmnet completed
* 17:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 17:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1235: Migration of db1235.eqiad.wmnet completed
* 17:08 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:45 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:43 dzahn@dns1005: END - running authdns-update
* 16:42 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:41 dzahn@dns1005: START - running authdns-update
* 16:41 mutante: releases.wikimedia.org - switching backend from codfw to eqiad - releases1003 is now the source of rsync for uploaded releases files (use releases.discovery.wmnet to not have to think about it) - [[phab:T418299|T418299]]
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts rdb2007.codfw.wmnet
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts rdb1011.eqiad.wmnet
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2009.codfw.wmnet
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:33 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Migration of db2212.codfw.wmnet completed
* 16:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1235: Migration of db1235.eqiad.wmnet completed
* 16:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2212.codfw.wmnet with OS trixie
* 16:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1235.eqiad.wmnet with OS trixie
* 16:13 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:07 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 16:05 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 16:05 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 16:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 16:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2212.codfw.wmnet with reason: host reimage
* 16:01 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
* 16:01 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:01 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
* 16:01 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 16:00 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
* 16:00 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 16:00 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
* 16:00 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2212.codfw.wmnet with reason: host reimage
* 15:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1235.eqiad.wmnet with reason: host reimage
* 15:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:58 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 15:57 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
* 15:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:57 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 15:57 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/wikifeeds: apply
* 15:56 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2009.codfw.wmnet
* 15:55 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:55 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb1011.eqiad.wmnet
* 15:55 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:55 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2007.codfw.wmnet
* 15:54 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 15:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1235.eqiad.wmnet with reason: host reimage
* 15:54 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 15:53 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 15:53 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 15:40 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 15:40 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2212.codfw.wmnet with OS trixie
* 15:39 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 15:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1235.eqiad.wmnet with OS trixie
* 15:36 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 15:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1235: Upgrading db1235.eqiad.wmnet
* 15:35 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 15:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1235: Upgrading db1235.eqiad.wmnet
* 15:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:32 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:31 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 15:30 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300822{{!}}T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530)]] (duration: 11m 29s)
* 15:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2212: Upgrading db2212.codfw.wmnet
* 15:26 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2212: Upgrading db2212.codfw.wmnet
* 15:26 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:26 cscott@deploy1003: cscott: Continuing with deployment
* 15:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1235: Upgrading db1235.eqiad.wmnet
* 15:25 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1235: Upgrading db1235.eqiad.wmnet
* 15:25 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:21 cscott@deploy1003: cscott: Backport for [[gerrit:1300822{{!}}T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:19 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1300822{{!}}T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530)]]
* 15:18 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 15:17 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 15:13 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 15:13 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 15:13 moritzm: installing libdbi-perl security updates
* 14:53 moritzm: installing Bind security updates (just client-side tools/libraries)
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry
* 14:48 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry
* 14:43 moritzm: installing Poppler security updates
* 14:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:33 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 14:32 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 14:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1234: Migration of db1234.eqiad.wmnet completed
* 14:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
* 14:24 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
* 14:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
* 14:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
* 14:00 Lucas_WMDE: UTC afternoon backport+config window done
* 13:58 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300733{{!}}stream: webrequest.page_view_stats.dev0 (T428725)]] (duration: 08m 12s)
* 13:57 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5024.*
* 13:55 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp5024.*
* 13:55 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5020.*
* 13:54 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 13:52 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1300733{{!}}stream: webrequest.page_view_stats.dev0 (T428725)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:51 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5004*<nowiki>}</nowiki> and A:liberica
* 13:50 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1300733{{!}}stream: webrequest.page_view_stats.dev0 (T428725)]]
* 13:50 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5004*<nowiki>}</nowiki> and A:liberica
* 13:50 slyngs: reloading liberica config on lvs5004
* 13:50 moritzm: installing openssl security updates
* 13:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5006.eqsin.wmnet with OS bookworm
* 13:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1234: Migration of db1234.eqiad.wmnet completed
* 13:46 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:45 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2202.codfw.wmnet with OS trixie
* 13:43 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298890{{!}}Add 2FA enforcement demotion config for phase 3 groups (T423120)]] (duration: 07m 19s)
* 13:39 alexsanford@deploy1003: alexsanford: Continuing with deployment
* 13:38 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1298890{{!}}Add 2FA enforcement demotion config for phase 3 groups (T423120)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:36 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1298890{{!}}Add 2FA enforcement demotion config for phase 3 groups (T423120)]]
* 13:36 slyngshede@dns1004: END - running authdns-update
* 13:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1234.eqiad.wmnet with OS trixie
* 13:34 moritzm: installing dovecot security updates
* 13:34 slyngshede@dns1004: START - running authdns-update
* 13:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:32 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300787{{!}}hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940)]] (duration: 06m 59s)
* 13:29 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 13:29 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 13:29 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:29 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:28 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:28 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:28 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 13:27 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300787{{!}}hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2202.codfw.wmnet with reason: host reimage
* 13:25 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300787{{!}}hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940)]]
* 13:25 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:24 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300736{{!}}fix: correct intake-url and payload type for NCS experiment events (T422295)]] (duration: 06m 51s)
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
* 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
* 13:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
* 13:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2202.codfw.wmnet with reason: host reimage
* 13:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1300736{{!}}fix: correct intake-url and payload type for NCS experiment events (T422295)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 13:17 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 13:16 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1300736{{!}}fix: correct intake-url and payload type for NCS experiment events (T422295)]]
* 13:15 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
* 13:14 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:13 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300731{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] (duration: 08m 47s)
* 13:13 andrewbogott: sudo -i reprepro --noskipold --component thirdparty/openstack-trixie-flamingo-backports update trixie-wikimedia
* 13:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
* 13:12 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 13:12 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/iOS_FAQ 'Wikimedia Apps/FAQ/iOS' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:12 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 13:12 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 13:11 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 13:11 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:11 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:11 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 13:11 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 13:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 13:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 13:09 gkyziridis@deploy1003: gkyziridis: Continuing with deployment
* 13:06 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1300731{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:06 claime: echo 'https://api.wikimedia.org/service/lw/specs/openapi.yaml' {{!}} mwscript-k8s --attach -- purgeList.php
* 13:04 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1300731{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]]
* 13:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2202.codfw.wmnet with OS trixie
* 13:00 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:57 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1234.eqiad.wmnet with OS trixie
* 12:55 moritzm: installing Exim security updates on Bullseye
* 12:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ganeti5006
* 12:47 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti5006
* 12:46 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti5006
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti5006.eqsin.wmnet 9.0.132.10.in-addr.arpa 9.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 12:46 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti5006.eqsin.wmnet 9.0.132.10.in-addr.arpa 9.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5006 - jmm@cumin2002"
* 12:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5006 - jmm@cumin2002"
* 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1234: Upgrading db1234.eqiad.wmnet
* 12:44 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1234: Upgrading db1234.eqiad.wmnet
* 12:44 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2188: Migration of db2188.codfw.wmnet completed
* 12:29 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "UX improvements - oblivian@cumin1003"
* 12:29 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: UX improvements - oblivian@cumin1003
* 12:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1232: Migration of db1232.eqiad.wmnet completed
* 12:28 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: UX improvements - oblivian@cumin1003
* 12:28 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "UX improvements - oblivian@cumin1003"
* 12:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:26 jmm@cumin2002: START - Cookbook sre.hosts.move-vlan for host ganeti5006
* 12:26 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5006.eqsin.wmnet with OS bookworm
* 12:21 moritzm: remove ganeti5006 from eqsin cluster for reimage [[phab:T428229|T428229]]
* 12:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 12:10 moritzm: installing openjdk-21 security updates on Bookworm
* 12:03 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300764{{!}}Remove GrowthExperiments extension from closed wikis (T428884)]] (duration: 06m 53s)
* 11:59 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 11:58 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1300764{{!}}Remove GrowthExperiments extension from closed wikis (T428884)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:56 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1300764{{!}}Remove GrowthExperiments extension from closed wikis (T428884)]]
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb1012.eqiad.wmnet
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2010.codfw.wmnet
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 11:46 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:46 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2008.codfw.wmnet
* 11:46 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2188: Migration of db2188.codfw.wmnet completed
* 11:44 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 11:43 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:43 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1232: Migration of db1232.eqiad.wmnet completed
* 11:38 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:37 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 11:37 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 11:36 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 11:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2188.codfw.wmnet with OS trixie
* 11:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb1012.eqiad.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2008.codfw.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2010.codfw.wmnet
* 11:33 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 11:32 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 11:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1232.eqiad.wmnet with OS trixie
* 11:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2002.codfw.wmnet
* 11:25 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300749{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300751{{!}}hCaptcha: Enable for DiscussionTools on all wikis (T426039)]] (duration: 08m 38s)
* 11:21 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 11:19 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300749{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300751{{!}}hCaptcha: Enable for DiscussionTools on all wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2188.codfw.wmnet with reason: host reimage
* 11:17 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300749{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300751{{!}}hCaptcha: Enable for DiscussionTools on all wikis (T426039)]]
* 11:15 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2188.codfw.wmnet with reason: host reimage
* 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1232.eqiad.wmnet with reason: host reimage
* 11:13 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2002.codfw.wmnet
* 11:13 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 11:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 11:11 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 11:09 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2001.codfw.wmnet
* 11:09 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1232.eqiad.wmnet with reason: host reimage
* 11:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:04 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2001.codfw.wmnet
* 11:04 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
* 11:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1262.eqiad.wmnet with reason: crash
* 11:00 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 11:00 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
* 10:59 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:59 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 10:58 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2188.codfw.wmnet with OS trixie
* 10:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2188: Upgrading db2188.codfw.wmnet
* 10:52 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2188: Upgrading db2188.codfw.wmnet
* 10:52 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1232.eqiad.wmnet with OS trixie
* 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1232: Upgrading db1232.eqiad.wmnet
* 10:48 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1232: Upgrading db1232.eqiad.wmnet
* 10:48 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:33 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:32 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:31 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300734{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300727{{!}}hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039)]] (duration: 11m 01s)
* 10:26 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 10:23 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:23 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:22 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300734{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300727{{!}}hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:20 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300734{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300727{{!}}hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039)]]
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:10 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:10 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 10:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2045.codfw.wmnet with OS trixie
* 10:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2046', diff saved to https://phabricator.wikimedia.org/P94069 and previous config saved to /var/cache/conftool/dbconfig/20260611-100221-marostegui.json
* 10:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2046', diff saved to https://phabricator.wikimedia.org/P94068 and previous config saved to /var/cache/conftool/dbconfig/20260611-100145-marostegui.json
* 10:01 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:59 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300580{{!}}ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976)]] (duration: 15m 41s)
* 09:54 jiji@deploy1003: jiji: Continuing with deployment
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2045.codfw.wmnet with reason: host reimage
* 09:45 jiji@deploy1003: jiji: Backport for [[gerrit:1300580{{!}}ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:43 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1300580{{!}}ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976)]]
* 09:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2045.codfw.wmnet with reason: host reimage
* 09:37 elukey: uploaded spicerack_12.8.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 09:26 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
* 09:26 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es2045.codfw.wmnet with OS bookworm
* 09:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2176: Migration of db2176.codfw.wmnet completed
* 09:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1219: Migration of db1219.eqiad.wmnet completed
* 09:11 claime: cumin -x 'A:swift-fe' "disable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
* 08:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS bookworm
* 08:39 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2176: Migration of db2176.codfw.wmnet completed
* 08:34 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94055)
* 08:34 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1219: Migration of db1219.eqiad.wmnet completed
* 08:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94053)
* 08:30 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T428823|T428823]] (duration: 01m 18s)
* 08:29 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T428823|T428823]]
* 08:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2176.codfw.wmnet with OS trixie
* 08:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1021: Migration to 10.11.17
* 08:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 08:25 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 08:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc1021: Migration to 10.11.17
* 08:25 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94052)
* 08:24 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): Testing upgrade for [[phab:T428823|T428823]] (duration: 01m 17s)
* 08:23 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): Testing upgrade for [[phab:T428823|T428823]]
* 08:22 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94051)
* 08:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1219.eqiad.wmnet with OS trixie
* 08:17 moritzm: installing PHP 8.2 security updates
* 08:15 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 08:14 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 08:11 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 08:11 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 08:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
* 08:08 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1013.eqiad.wmnet with OS trixie
* 08:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:06 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 08:06 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 08:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2021.codfw.wmnet,pc1021.eqiad.wmnet with reason: upgrade
* 08:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1219.eqiad.wmnet with reason: host reimage
* 08:05 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 08:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 08:04 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
* 08:04 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 08:03 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 08:03 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 07:58 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1219.eqiad.wmnet with reason: host reimage
* 07:56 marostegui: install mariadb 10.11.17 on pc1 [[phab:T427345|T427345]]
* 07:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1013.eqiad.wmnet with reason: host reimage
* 07:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1013.eqiad.wmnet with reason: host reimage
* 07:49 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 07:49 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 07:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 07:47 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 07:47 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 07:46 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2176.codfw.wmnet with OS trixie
* 07:43 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1219.eqiad.wmnet with OS trixie
* 07:43 moritzm: imported Jenkins 2.541.3 for thirdparty/ci (Bullseye) and thirdparty/jenkins (Bookworm, Trixie)
* 07:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 07:35 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1013.eqiad.wmnet with OS trixie
* 07:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2176: Upgrading db2176.codfw.wmnet
* 07:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1219: Upgrading db1219.eqiad.wmnet
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2176: Upgrading db2176.codfw.wmnet
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1219: Upgrading db1219.eqiad.wmnet
* 07:31 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:30 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 07:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1163: Repooling
* 07:19 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 06:51 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
* 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2042', diff saved to https://phabricator.wikimedia.org/P94044 and previous config saved to /var/cache/conftool/dbconfig/20260611-065049-marostegui.json
* 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2042', diff saved to https://phabricator.wikimedia.org/P94043 and previous config saved to /var/cache/conftool/dbconfig/20260611-065027-marostegui.json
* 06:44 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1163: Repooling
* 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1163 [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94041 and previous config saved to /var/cache/conftool/dbconfig/20260611-064319-fceratto.json
* 06:42 fceratto@dns1005: END - running authdns-update
* 06:40 fceratto@dns1005: START - running authdns-update
* 06:33 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:33 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-write for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:33 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1184 to s1 primary and set section read-write [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94040 and previous config saved to /var/cache/conftool/dbconfig/20260611-063323-fceratto.json
* 06:32 fceratto@cumin1003: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94039 and previous config saved to /var/cache/conftool/dbconfig/20260611-063251-fceratto.json
* 06:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:32 fceratto@cumin1003: Dbctl change: Setting sections s1 as read-write for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:32 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-write for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:31 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94037 and previous config saved to /var/cache/conftool/dbconfig/20260611-063100-fceratto.json
* 06:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:30 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-only for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:30 fceratto@cumin1003: Dbctl change: Setting sections s1 as read-only for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:29 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:29 federico3: Starting s1 eqiad failover from db1163 to db1184 - [[phab:T426083|T426083]]
* 06:22 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1184 with weight 0 [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94035 and previous config saved to /var/cache/conftool/dbconfig/20260611-062224-fceratto.json
* 06:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T426083|T426083]]
* 05:37 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 05:28 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 05:27 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 05:18 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 05:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2045: Upgrading es2045.codfw.wmnet
* 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2045: Upgrading es2045.codfw.wmnet
* 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 44s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:23 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2046.*
* 01:19 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 01:18 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 01:18 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1009.eqiad.wmnet with OS trixie
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:11 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:11 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:11 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:10 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:10 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:09 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 01:09 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 01:08 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 01:08 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 01:08 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 01:07 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:07 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 01:06 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 01:06 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 01:06 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 01:05 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 01:05 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 01:05 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 01:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
* 00:58 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
* 00:54 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main1009
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main1009
* 00:41 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main1009
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main1009.eqiad.wmnet 37.48.64.10.in-addr.arpa 7.3.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:41 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main1009.eqiad.wmnet 37.48.64.10.in-addr.arpa 7.3.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1009 - jasmine@cumin2002"
* 00:40 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1009 - jasmine@cumin2002"
* 00:39 cdanis@cumin1003: dbctl commit (dc=all): 'depool db1262', diff saved to https://phabricator.wikimedia.org/P94032 and previous config saved to /var/cache/conftool/dbconfig/20260611-003950-cdanis.json
* 00:36 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 00:34 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5020.*
* 00:30 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main1009
* 00:30 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1009.eqiad.wmnet with OS trixie
* 00:03 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5024.*
== 2026-06-10 ==
* 23:53 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5024.*
* 23:15 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300154{{!}}Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188)]] (duration: 11m 37s)
* 23:11 krinkle@deploy1003: krinkle: Continuing with deployment
* 23:06 krinkle@deploy1003: krinkle: Backport for [[gerrit:1300154{{!}}Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:04 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1300154{{!}}Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188)]]
* 22:57 ladsgroup@dns1004: END - running authdns-update
* 22:55 ladsgroup@dns1004: START - running authdns-update
* 22:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5024.eqsin.wmnet with OS trixie
* 22:13 mutante: gerrit - restarting service for logging change
* 22:11 dzahn@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:10:00 on gerrit.wikimedia.org with reason: service restart
* 22:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on gerrit2003.wikimedia.org with reason: service restart
* 22:06 mutante: gerrit-spare: restarting gerrit
* 22:06 mutante: gerrit-replica: restarting gerrit
* 21:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 21:37 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 21:22 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300250{{!}}ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801)]], [[gerrit:1300248{{!}}tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade]] (duration: 08m 23s)
* 21:17 jforrester@deploy1003: jforrester: Continuing with deployment
* 21:15 jforrester@deploy1003: jforrester: Backport for [[gerrit:1300250{{!}}ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801)]], [[gerrit:1300248{{!}}tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:13 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1300250{{!}}ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801)]], [[gerrit:1300248{{!}}tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade]]
* 21:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5024
* 21:02 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5024
* 21:02 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300247{{!}}Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902)]] (duration: 06m 51s)
* 21:00 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5024
* 21:00 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5024.eqsin.wmnet 35.0.132.10.in-addr.arpa 5.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 21:00 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5024.eqsin.wmnet 35.0.132.10.in-addr.arpa 5.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 21:00 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:00 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5024 - brett@cumin2002"
* 20:59 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5024 - brett@cumin2002"
* 20:57 catrope@deploy1003: catrope: Continuing with deployment
* 20:57 catrope@deploy1003: catrope: Backport for [[gerrit:1300247{{!}}Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:55 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1300247{{!}}Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902)]]
* 20:54 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:50 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5024
* 20:49 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
* 20:48 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5020.*
* 20:44 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300073{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] (duration: 11m 55s)
* 20:40 catrope@deploy1003: catrope, gkyziridis: Continuing with deployment
* 20:34 catrope@deploy1003: catrope, gkyziridis: Backport for [[gerrit:1300073{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:32 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1300073{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]]
* 20:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5020.eqsin.wmnet with OS trixie
* 20:30 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300226{{!}}[arzwiki] Change the wordmark (T427720)]] (duration: 09m 49s)
* 20:25 catrope@deploy1003: gergesshamon, catrope: Continuing with deployment
* 20:22 catrope@deploy1003: gergesshamon, catrope: Backport for [[gerrit:1300226{{!}}[arzwiki] Change the wordmark (T427720)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:20 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1300226{{!}}[arzwiki] Change the wordmark (T427720)]]
* 19:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 19:53 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 19:30 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 19:27 bblack@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=1) rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 19:23 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2046.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:19 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2046.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:19 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5020
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5020
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2044.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:18 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5020
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5020.eqsin.wmnet 24.0.132.10.in-addr.arpa 4.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:18 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5020.eqsin.wmnet 24.0.132.10.in-addr.arpa 4.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:17 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5020 - brett@cumin2002"
* 19:17 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5020 - brett@cumin2002"
* 19:14 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2044.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:11 brett@cumin2002: START - Cookbook sre.dns.netbox
* 19:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 19:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2174: Migration of db2174.codfw.wmnet completed
* 19:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 19:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1218: Migration of db1218.eqiad.wmnet completed
* 18:24 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5020
* 18:23 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS trixie
* 18:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2174: Migration of db2174.codfw.wmnet completed
* 18:20 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 18:17 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1218: Migration of db1218.eqiad.wmnet completed
* 18:16 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5018.*
* 18:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2174.codfw.wmnet with OS trixie
* 18:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1218.eqiad.wmnet with OS trixie
* 17:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
* 17:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1218.eqiad.wmnet with reason: host reimage
* 17:46 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2010.codfw.wmnet with OS trixie
* 17:45 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 17:44 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 17:44 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
* 17:42 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1218.eqiad.wmnet with reason: host reimage
* 17:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94021)
* 17:29 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
* 17:26 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1218.eqiad.wmnet with OS trixie
* 17:26 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2174.codfw.wmnet with OS trixie
* 17:25 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:24 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:24 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1218: Upgrading db1218.eqiad.wmnet
* 17:24 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:24 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1218: Upgrading db1218.eqiad.wmnet
* 17:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 17:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2174: Upgrading db2174.codfw.wmnet
* 17:23 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:23 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
* 17:23 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:22 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2174: Upgrading db2174.codfw.wmnet
* 17:22 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:22 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 17:22 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:22 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 17:22 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-text and not P<nowiki>{</nowiki>cp7008*<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 17:21 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 17:21 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:19 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 17:19 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:18 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:18 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:17 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:17 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:17 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:13 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:12 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-ntp (exit_code=0) rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 17:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 17:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1206: Migration of db1206.eqiad.wmnet completed
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2010
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2010
* 17:02 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2010
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2010.codfw.wmnet 35.48.192.10.in-addr.arpa 5.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:02 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2010.codfw.wmnet 35.48.192.10.in-addr.arpa 5.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2010 - jasmine@cumin2002"
* 17:01 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2010 - jasmine@cumin2002"
* 16:57 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 16:50 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2010
* 16:50 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS trixie
* 16:41 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 16:39 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 16:39 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 16:34 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 16:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5018.eqsin.wmnet with OS trixie
* 16:22 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 16:20 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 16:17 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1206: Migration of db1206.eqiad.wmnet completed
* 16:15 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 16:15 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 16:14 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 16:12 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 16:12 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:11 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 16:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1206.eqiad.wmnet with OS trixie
* 16:01 blblack: apt: uploaded libvmod-wmfuniq 0.3.0 for trixie
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 15:53 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:52 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:51 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 15:50 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1206.eqiad.wmnet with reason: host reimage
* 15:45 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1206.eqiad.wmnet with reason: host reimage
* 15:43 sukhe@cumin1003: END (FAIL) - Cookbook sre.dns.admin (exit_code=99) DNS admin: depool drmrs [reason: no reason specified, no task ID specified]
* 15:42 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool drmrs [reason: no reason specified, no task ID specified]
* 15:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2173: Migration of db2173.codfw.wmnet completed
* 15:34 topranks: drain traffic through cr2-drmrs to reset pic0
* 15:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94013)
* 15:30 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1206.eqiad.wmnet with OS trixie
* 15:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1206: Upgrading db1206.eqiad.wmnet
* 15:28 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1206: Upgrading db1206.eqiad.wmnet
* 15:27 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:25 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:24 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1009
* 15:24 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Harroyo-wmf out of all services on: 2436 hosts
* 15:23 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1009
* 15:21 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:20 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release
* 15:19 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5018
* 15:19 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5018
* 15:18 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:18 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5018
* 15:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5018.eqsin.wmnet 18.0.132.10.in-addr.arpa 8.1.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 15:18 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5018.eqsin.wmnet 18.0.132.10.in-addr.arpa 8.1.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 15:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:15 brett@cumin2002: START - Cookbook sre.dns.netbox
* 15:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1195: Migration of db1195.eqiad.wmnet completed
* 15:12 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:11 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:11 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin1003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:11 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin1003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:08 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300169{{!}}Fix snak value display for rtl languages (T360854)]], [[gerrit:1300168{{!}}Fix snak value display for rtl languages (T360854)]] (duration: 08m 39s)
* 15:03 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 15:01 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1300169{{!}}Fix snak value display for rtl languages (T360854)]], [[gerrit:1300168{{!}}Fix snak value display for rtl languages (T360854)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:59 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1300169{{!}}Fix snak value display for rtl languages (T360854)]], [[gerrit:1300168{{!}}Fix snak value display for rtl languages (T360854)]]
* 14:58 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:55 Lucas_WMDE: lucaswerkmeister-wmde@deploy1003 $ printf 'https://www.mediawiki.org/keys/%s\n' '' 'keys.txt' 'keys.html' {{!}} mwscript-k8s --attach --comment=[[phab:T423267|T423267]] purgeList mediawikiwiki
* 14:54 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release, now with correct schema
* 14:53 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2173: Migration of db2173.codfw.wmnet completed
* 14:50 ayounsi@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin2003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:50 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:48 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:47 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299614{{!}}Add my public key to mediawiki.org/keys (T423267)]] (duration: 08m 33s)
* 14:46 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:42 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, matmarex: Continuing with deployment
* 14:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2173.codfw.wmnet with OS trixie
* 14:40 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, matmarex: Backport for [[gerrit:1299614{{!}}Add my public key to mediawiki.org/keys (T423267)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:40 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:40 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:38 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1299614{{!}}Add my public key to mediawiki.org/keys (T423267)]]
* 14:38 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 14:34 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:34 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:33 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 14:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1195: Migration of db1195.eqiad.wmnet completed
* 14:28 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 14:26 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 14:26 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 14:24 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release, now with dblist translate
* 14:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2173.codfw.wmnet with reason: host reimage
* 14:23 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 14:22 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 14:22 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 14:21 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 14:20 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart (exit_code=0) rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 14:20 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2173.codfw.wmnet with reason: host reimage
* 14:20 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1195.eqiad.wmnet with OS trixie
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 14:16 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 14:15 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 14:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 14:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 14:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 14:13 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:13 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 14:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 14:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 14:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 14:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 14:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 14:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 14:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2173.codfw.wmnet with OS trixie
* 14:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 14:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1195.eqiad.wmnet with reason: host reimage
* 14:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 13:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2173: Upgrading db2173.codfw.wmnet
* 13:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2173: Upgrading db2173.codfw.wmnet
* 13:58 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:58 atsuko@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/ttmserver-export.php --wiki=default --ttmserver eqiad-test # [[phab:T425377|T425377]] populating production index on test cluster to estimate time required for the release
* 13:56 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1195.eqiad.wmnet with reason: host reimage
* 13:54 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Atieno out of all services on: 2436 hosts
* 13:42 Lucas_WMDE: UTC afternoon backport+config window done
* 13:42 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1195.eqiad.wmnet with OS trixie
* 13:36 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297237{{!}}wmf-config: Update private subnets to include additions (T427393)]] (duration: 07m 20s)
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1195: Upgrading db1195.eqiad.wmnet
* 13:33 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling restart_daemons on A:hcaptcha-proxy and A:hcaptcha-proxy
* 13:33 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling restart_daemons on A:durum and A:durum
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2170: Migration of db2170.codfw.wmnet completed
* 13:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1195: Upgrading db1195.eqiad.wmnet
* 13:32 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:32 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, brett: Continuing with deployment
* 13:32 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough
* 13:31 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 13:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, brett: Backport for [[gerrit:1297237{{!}}wmf-config: Update private subnets to include additions (T427393)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:31 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 13:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1297237{{!}}wmf-config: Update private subnets to include additions (T427393)]]
* 13:28 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5018.eqsin.wmnet with reason: host down
* 13:28 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling restart_daemons on A:tcpproxy and A:tcpproxy
* 13:25 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5018.eqsin.wmnet,service=(cdn{{!}}ats-be)
* 13:22 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 13:20 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling restart_daemons on A:durum and A:durum
* 13:20 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling restart_daemons on A:hcaptcha-proxy and A:hcaptcha-proxy
* 13:19 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299676{{!}}Enable ULS v2 on group0 wikis]] (duration: 17m 00s)
* 13:19 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough
* 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1186: Migration of db1186.eqiad.wmnet completed
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:15 sbisson@deploy1003: sbisson, abi: Continuing with deployment
* 13:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy rolling restart_daemons on A:tcpproxy and A:tcpproxy
* 13:05 sbisson@deploy1003: sbisson, abi: Backport for [[gerrit:1299676{{!}}Enable ULS v2 on group0 wikis]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1014.eqiad.wmnet with OS trixie
* 13:02 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1299676{{!}}Enable ULS v2 on group0 wikis]]
* 12:47 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2170: Migration of db2170.codfw.wmnet completed
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5004.eqsin.wmnet with OS bookworm
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1014.eqiad.wmnet with reason: host reimage
* 12:42 topranks: re-map DSCP AF41 from 'low' to 'normal' priority qos class on network [[phab:T424640|T424640]]
* 12:41 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1014.eqiad.wmnet with reason: host reimage
* 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2170.codfw.wmnet with OS trixie
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1186: Migration of db1186.eqiad.wmnet completed
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1014
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host rdb1014
* 12:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1186.eqiad.wmnet with OS trixie
* 12:21 jiji@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host rdb1014
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) rdb1014.eqiad.wmnet 42.48.64.10.in-addr.arpa 2.4.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:21 jiji@cumin1003: START - Cookbook sre.dns.wipe-cache rdb1014.eqiad.wmnet 42.48.64.10.in-addr.arpa 2.4.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host rdb1014 - jiji@cumin1003"
* 12:21 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host rdb1014 - jiji@cumin1003"
* 12:20 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2170.codfw.wmnet with reason: host reimage
* 12:16 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 12:13 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1014
* 12:12 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1014.eqiad.wmnet with OS trixie
* 12:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2170.codfw.wmnet with reason: host reimage
* 12:08 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300104{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1300102{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1299643{{!}}wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792)]] (duration: 11m 06s)
* 12:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1186.eqiad.wmnet with reason: host reimage
* 12:03 reedy@deploy1003: reedy: Continuing with deployment
* 12:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1186.eqiad.wmnet with reason: host reimage
* 11:59 reedy@deploy1003: reedy: Backport for [[gerrit:1300104{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1300102{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1299643{{!}}wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:57 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1300104{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1300102{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1299643{{!}}wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792)]]
* 11:53 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2170.codfw.wmnet with OS trixie
* 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ganeti5004
* 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti5004
* 11:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2170: Upgrading db2170.codfw.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2170: Upgrading db2170.codfw.wmnet
* 11:49 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti5004
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti5004.eqsin.wmnet 40.0.132.10.in-addr.arpa 0.4.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 11:49 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti5004.eqsin.wmnet 40.0.132.10.in-addr.arpa 0.4.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5004 - jmm@cumin2002"
* 11:49 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5004 - jmm@cumin2002"
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:48 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1186.eqiad.wmnet with OS trixie
* 11:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1186: Upgrading db1186.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1186: Upgrading db1186.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:38 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:35 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 11:34 jmm@cumin2002: START - Cookbook sre.hosts.move-vlan for host ganeti5004
* 11:34 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 11:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5004.eqsin.wmnet with OS bookworm
* 11:34 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 11:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1151: Security updates
* 11:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:33 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:33 root@cumin1003: START - Cookbook sre.mysql.pool pool db1151: Security updates
* 11:31 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:30 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:30 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:30 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:27 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:27 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:23 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:23 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:23 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 11:23 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 11:16 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 11:15 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 11:15 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:15 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:09 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1151: Security updates
* 11:09 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:09 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:09 root@cumin1003: START - Cookbook sre.mysql.depool depool db1151: Security updates
* 11:08 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300092{{!}}ProductionServices: re-add poolcounter2006 (T426736)]] (duration: 06m 55s)
* 11:04 blake@deploy1003: blake: Continuing with deployment
* 11:04 blake@deploy1003: blake: Backport for [[gerrit:1300092{{!}}ProductionServices: re-add poolcounter2006 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:03 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:01 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300092{{!}}ProductionServices: re-add poolcounter2006 (T426736)]]
* 10:59 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2006.codfw.wmnet
* 10:57 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:57 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:56 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:56 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:56 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:56 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2006.codfw.wmnet
* 10:56 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300087{{!}}ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736)]] (duration: 06m 42s)
* 10:51 blake@deploy1003: blake: Continuing with deployment
* 10:51 moritzm: remove ganeti5004 from eqsin cluster for reimage [[phab:T428229|T428229]]
* 10:51 blake@deploy1003: blake: Backport for [[gerrit:1300087{{!}}ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:49 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300087{{!}}ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736)]]
* 10:47 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2005.codfw.wmnet
* 10:47 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:46 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:46 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:45 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:43 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2005.codfw.wmnet
* 10:43 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300082{{!}}ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736)]] (duration: 07m 38s)
* 10:41 moritzm: installing nginx security updates
* 10:38 blake@deploy1003: blake: Continuing with deployment
* 10:38 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1152: Security updates
* 10:38 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:38 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:38 root@cumin1003: START - Cookbook sre.mysql.pool pool db1152: Security updates
* 10:38 blake@deploy1003: blake: Backport for [[gerrit:1300082{{!}}ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:37 moritzm: failover Ganeti master in eqsin to ganeti5007 [[phab:T428229|T428229]]
* 10:35 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300082{{!}}ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736)]]
* 10:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1007.eqiad.wmnet
* 10:29 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1007.eqiad.wmnet
* 10:29 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300072{{!}}ProductionServices: reboot poolcounter1007 (T426736)]] (duration: 07m 45s)
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 10:27 jmm@cumin2002: DONE (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for sretest2009.codfw.wmnet: Renew puppet certificate - jmm@cumin2002
* 10:24 blake@deploy1003: blake: Continuing with deployment
* 10:23 blake@deploy1003: blake: Backport for [[gerrit:1300072{{!}}ProductionServices: reboot poolcounter1007 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:21 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:21 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300072{{!}}ProductionServices: reboot poolcounter1007 (T426736)]]
* 10:21 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:21 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:21 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:20 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1006.eqiad.wmnet
* 10:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Security updates
* 10:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:14 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:14 root@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Security updates
* 10:13 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1006.eqiad.wmnet
* 10:12 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300064{{!}}ProductionServices: reboot poolcounter1006.eqiad (T426736)]] (duration: 07m 46s)
* 10:07 blake@deploy1003: blake: Continuing with deployment
* 10:06 blake@deploy1003: blake: Backport for [[gerrit:1300064{{!}}ProductionServices: reboot poolcounter1006.eqiad (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:04 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300064{{!}}ProductionServices: reboot poolcounter1006.eqiad (T426736)]]
* 09:57 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300058{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]], [[gerrit:1300059{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]] (duration: 09m 32s)
* 09:52 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1300058{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]], [[gerrit:1300059{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1300058{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]], [[gerrit:1300059{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]]
* 09:35 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 09:34 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 09:32 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 09:32 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 09:26 moritzm: upgrade routinator in eqiad to 0.15.2 [[phab:T428456|T428456]]
* 09:23 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:23 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 09:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus5003.eqsin.wmnet to plain
* 09:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to plain
* 09:15 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:29 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:29 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:01 fceratto@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host db1215.eqiad.wmnet with OS trixie
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 07:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
* 07:44 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1215.eqiad.wmnet with reason: host reimage
* 07:41 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 07:40 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
* 07:40 moritzm: installing openssl security updates
* 07:39 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1215.eqiad.wmnet with reason: host reimage
* 07:38 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 07:37 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 07:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:29 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299556{{!}}ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561{{!}}ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529{{!}}translate: adding separate read/write endpoints (T425377)]] (duration: 14m 03s)
* 07:25 atsuko@deploy1003: atsuko: Continuing with deployment
* 07:23 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1215.eqiad.wmnet with OS trixie
* 07:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1215.eqiad.wmnet with reason: Reimage
* 07:21 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:20 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:20 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:17 atsuko@deploy1003: atsuko: Backport for [[gerrit:1299556{{!}}ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561{{!}}ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529{{!}}translate: adding separate read/write endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:15 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1299556{{!}}ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561{{!}}ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529{{!}}translate: adding separate read/write endpoints (T425377)]]
* 07:14 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:12 atsukoito: backporting extensions/Translate to wmf/1.47.0-wmf.5 and applying the config
* 07:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 06:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 06:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 05:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 05:43 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 05:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 05:41 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 47s)
* 02:07 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1008.eqiad.wmnet with OS trixie
* 02:03 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 02:02 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:52 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:51 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:51 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:50 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:50 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:49 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1008.eqiad.wmnet with reason: host reimage
* 01:49 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:49 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:49 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:49 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:48 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 01:48 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 01:47 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 01:47 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 01:46 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 01:46 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 01:44 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 01:44 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 01:43 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 01:43 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1008.eqiad.wmnet with reason: host reimage
* 01:25 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main1008
* 01:24 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main1008
* 01:24 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main1008
* 01:24 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main1008.eqiad.wmnet 45.32.64.10.in-addr.arpa 5.4.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 01:23 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main1008.eqiad.wmnet 45.32.64.10.in-addr.arpa 5.4.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 01:23 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:23 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1008 - jasmine@cumin2002"
* 01:23 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1008 - jasmine@cumin2002"
* 01:19 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 01:12 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main1008
* 01:11 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1008.eqiad.wmnet with OS trixie
* 01:00 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2009.codfw.wmnet with OS trixie
* 00:54 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 00:53 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 00:43 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
* 00:40 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:38 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
* 00:38 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:38 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:37 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:37 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:36 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 00:36 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:34 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 00:34 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 00:33 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 00:33 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 00:32 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 00:32 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 00:32 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2009
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2009
* 00:15 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2009
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2009.codfw.wmnet 33.48.192.10.in-addr.arpa 3.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:15 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2009.codfw.wmnet 33.48.192.10.in-addr.arpa 3.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2009 - jasmine@cumin2002"
* 00:15 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2009 - jasmine@cumin2002"
* 00:10 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 00:03 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2009
* 00:03 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS trixie
== 2026-06-09 ==
* 22:50 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299640{{!}}HandleSectionLinks: add temporary fallback to identify html headings (T428677)]] (duration: 08m 59s)
* 22:45 cscott@deploy1003: cscott: Continuing with deployment
* 22:43 cscott@deploy1003: cscott: Backport for [[gerrit:1299640{{!}}HandleSectionLinks: add temporary fallback to identify html headings (T428677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:41 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1299640{{!}}HandleSectionLinks: add temporary fallback to identify html headings (T428677)]]
* 22:15 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299639{{!}}[Bug] Donor Badge: Remove client prefs for control group (T428501)]] (duration: 20m 57s)
* 22:11 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:07 mutante: gerrit - apache httpd log file location moved to /srv/gerrit/site_path/review_site/logs/ [[phab:T425667|T425667]]
* 22:06 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on gerrit2003.wikimedia.org with reason: debug
* 21:56 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1299639{{!}}[Bug] Donor Badge: Remove client prefs for control group (T428501)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1299639{{!}}[Bug] Donor Badge: Remove client prefs for control group (T428501)]]
* 21:52 ryankemper: [[phab:T428241|T428241]] removed retired wdqs2009 full-graph journal dump (446G x2, ~892G) from clouddumps100[1-2]:/srv/dumps/xmldatadumps/public/other/wdqs
* 21:49 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299602{{!}}Revert "Create VectorComponentPageToolbar component" (T428649)]] (duration: 08m 16s)
* 21:48 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
* 21:45 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 21:43 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1299602{{!}}Revert "Create VectorComponentPageToolbar component" (T428649)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:41 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1299602{{!}}Revert "Create VectorComponentPageToolbar component" (T428649)]]
* 21:34 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit1003.wikimedia.org with reason: debug
* 21:27 maryum: Deployed security fix for [[phab:T428324|T428324]]
* 21:24 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
* 21:15 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
* 21:06 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
* 20:50 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS trixie
* 20:50 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299588{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270)]], [[gerrit:1299589{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T428270)]] (duration: 11m 13s)
* 20:46 cscott@deploy1003: cscott: Continuing with deployment
* 20:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS trixie
* 20:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:42 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:41 cscott@deploy1003: cscott: Backport for [[gerrit:1299588{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270)]], [[gerrit:1299589{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T428270)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:39 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1299588{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270)]], [[gerrit:1299589{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T428270)]]
* 20:38 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:33 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:32 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299454{{!}}wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902)]] (duration: 22m 08s)
* 20:28 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:28 cscott@deploy1003: cscott, gkyziridis: Continuing with deployment
* 20:24 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2004
* 20:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2004
* 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2003
* 20:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2003
* 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2002
* 20:13 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2002
* 20:13 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2001
* 20:13 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2001
* 20:12 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:12 cscott@deploy1003: cscott, gkyziridis: Backport for [[gerrit:1299454{{!}}wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1299454{{!}}wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902)]]
* 20:09 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:04 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:59 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:53 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:48 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:47 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:47 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:28 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wdqs1015.eqiad.wmnet
* 19:28 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:28 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wdqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
* 19:27 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wdqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
* 19:20 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
* 19:15 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2008.codfw.wmnet with OS trixie
* 19:15 ryankemper@cumin2002: START - Cookbook sre.hosts.decommission for hosts wdqs1015.eqiad.wmnet
* 19:12 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 19:12 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 19:00 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:58 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 18:58 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
* 18:58 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 18:58 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:57 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:57 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 18:56 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 18:56 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:55 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:55 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 18:55 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 18:54 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 18:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:54 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 18:53 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 18:53 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 18:53 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2003 to codfw - jhancock@cumin2002"
* 18:52 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2003 to codfw - jhancock@cumin2002"
* 18:52 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 18:52 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 18:51 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
* 18:51 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 18:51 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 18:51 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 18:50 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 18:50 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 18:47 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:47 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:47 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:42 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:42 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:31 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 18:29 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS trixie
* 18:26 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2008.codfw.wmnet with OS trixie
* 17:48 mutante: https://releases.wikimedia.org {{!}} https://releases-jenkins.wikimedia.org - down for maintenance [[phab:T418299|T418299]]
* 17:48 cmooney@dns2005: END - running authdns-update
* 17:47 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: reimage
* 17:47 cmooney@dns2005: START - running authdns-update
* 17:46 sukhe: sudo cumin 'A:hcaptcha-proxy' 'run-puppet-agent': rolling out CR {{Gerrit|1299427}} [[phab:T428539|T428539]]
* 17:43 jayme: kafka-main2008 is down due to hardware failure [[phab:T428654|T428654]]
* 17:32 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1002.eqiad.wmnet with OS trixie
* 17:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
* 17:06 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
* 17:05 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2008
* 17:05 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2008
* 17:04 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:04 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2008
* 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2008.codfw.wmnet 4.32.192.10.in-addr.arpa 4.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:04 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:04 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2008.codfw.wmnet 4.32.192.10.in-addr.arpa 4.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2008 - jasmine@cumin2002"
* 17:04 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5018
* 17:04 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2008 - jasmine@cumin2002"
* 17:03 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5018.eqsin.wmnet with OS trixie
* 16:58 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:58 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:57 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 16:57 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:57 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:50 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf1002.eqiad.wmnet with OS trixie
* 16:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:47 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1001.eqiad.wmnet with OS trixie
* 16:47 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 16:47 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/redioscope: apply
* 16:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:41 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 16:41 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 16:35 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2008
* 16:34 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS trixie
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:31 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
* 16:30 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
* 16:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
* 16:29 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:26 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
* 16:23 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
* 16:22 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: apply
* 16:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:16 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:12 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf1001.eqiad.wmnet with OS trixie
* 16:10 jiji@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'sync'.
* 16:09 jiji@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'sync'.
* 16:07 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf2002.codfw.wmnet with OS trixie
* 16:02 jiji@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 16:02 jiji@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 16:00 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'.
* 15:59 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
* 15:59 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'.
* 15:59 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 15:59 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'.
* 15:59 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/termbox: apply
* 15:58 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/termbox: apply
* 15:58 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/termbox: apply
* 15:57 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 15:57 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 15:57 lucaswerkmeister-wmde@deploy1003: helmfile [staging] DONE helmfile.d/services/termbox: apply
* 15:56 lucaswerkmeister-wmde@deploy1003: helmfile [staging] START helmfile.d/services/termbox: apply
* 15:54 jiji@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:53 jiji@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 15:51 jiji@deploy1003: Finished scap sync-world: redeploy {{Gerrit|1299468}} (duration: 07m 23s)
* 15:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf2002.codfw.wmnet with reason: host reimage
* 15:47 jiji@deploy1003: jiji: Continuing with deployment
* 15:46 jiji@deploy1003: jiji: redeploy {{Gerrit|1299468}} synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:46 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf2002.codfw.wmnet with reason: host reimage
* 15:45 jiji@deploy1003: Started scap sync-world: redeploy {{Gerrit|1299468}}
* 15:43 brouberol@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-eqiad
* 15:34 brennen@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab1004 for [[phab:T410849|T410849]] (followup for robots.txt) (duration: 00m 40s)
* 15:33 brennen@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab1004 for [[phab:T410849|T410849]] (followup for robots.txt)
* 15:33 brennen@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab2002 for [[phab:T410849|T410849]] (followup for robots.txt) (duration: 00m 45s)
* 15:32 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299468{{!}}ProductionServices.php: switch filebackend.php to rdb2015:6381 #2 (T418918 T291916)]] (duration: 07m 21s)
* 15:32 brennen@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab2002 for [[phab:T410849|T410849]] (followup for robots.txt)
* 15:28 jiji@deploy1003: Rolling back deployment
* 15:27 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf2002.codfw.wmnet with OS trixie
* 15:27 jiji@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 15:26 jiji@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 15:25 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1299468{{!}}ProductionServices.php: switch filebackend.php to rdb2015:6381 #2 (T418918 T291916)]]
* 15:22 urbanecm: Remove `migrateMentorStatusAwayToCommunityConfiguration` from updatelog on all wikis ([[phab:T409170|T409170]]; the script was only ever run as a dry-run)
* 15:21 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 15:21 jiji@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 15:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf2001.codfw.wmnet with OS trixie
* 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@d244a3e]: deploy phab1004 for [[phab:T410849|T410849]] (duration: 00m 42s)
* 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@d244a3e]: deploy phab1004 for [[phab:T410849|T410849]]
* 15:02 brennen@deploy1003: Finished deploy [phabricator/deployment@d244a3e]: deploy phab2002 for [[phab:T410849|T410849]] (duration: 00m 45s)
* 15:01 brennen@deploy1003: Started deploy [phabricator/deployment@d244a3e]: deploy phab2002 for [[phab:T410849|T410849]]
* 14:58 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf2001.codfw.wmnet with reason: host reimage
* 14:52 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf2001.codfw.wmnet with reason: host reimage
* 14:52 arnaudb@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab[2002-2003].codfw.wmnet,phab[1004-1006].eqiad.wmnet with reason: [[phab:T410849|T410849]]
* 14:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 14:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 14:40 moritzm: upgrade routinator in codfw to 0.15.2 [[phab:T428456|T428456]]
* 14:35 brouberol@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 14:33 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf2001.codfw.wmnet with OS trixie
* 14:26 brouberol@cumin1003: END (ERROR) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=97) rolling reboot on A:cephosd-eqiad
* 14:26 brouberol@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 14:20 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-codfw
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host parsoidtest1001.eqiad.wmnet
* 14:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2153: Migration of db2153.codfw.wmnet completed
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of rpki2003.codfw.wmnet to drbd
* 14:14 moritzm: imported routinator 0.15.2-1bookworm to thirdparty/routinator for bookworm-wikimedia [[phab:T428456|T428456]]
* 14:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1184: Migration of db1184.eqiad.wmnet completed
* 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host parsoidtest1001.eqiad.wmnet
* 14:07 Dreamy_Jazz: Afternoon UTC backport window done
* 14:07 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:06 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299495{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]], [[gerrit:1299502{{!}}SecurePollLogPager: Cast user IDs to ints before use (T428599)]] (duration: 06m 53s)
* 14:06 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: rack depool
* 14:03 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of rpki2003.codfw.wmnet to drbd
* 14:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow2004.codfw.wmnet to drbd
* 14:02 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:02 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1299495{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]], [[gerrit:1299502{{!}}SecurePollLogPager: Cast user IDs to ints before use (T428599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:59 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1299495{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]], [[gerrit:1299502{{!}}SecurePollLogPager: Cast user IDs to ints before use (T428599)]]
* 13:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:56 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:56 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:56 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 13:56 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 13:55 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:55 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* {{safesubst:SAL entry|1=13:55 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298929{{!}}Simplify fragment processing (T423700)]], [[gerrit:1298926{{!}}Move ::getFragmentsToTransform() to Content<nowiki>{</nowiki>Text,DOM<nowiki>}</nowiki>TransformStage]], [[gerrit:1298927{{!}}OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages]], [[gerrit:1298925{{!}}Reset DeduplicateStyles state between different pipeline executions (T428336 T428215)]], [[gerrit:1299497}}
* 13:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:51 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow2004.codfw.wmnet to drbd
* 13:50 cscott@deploy1003: cscott: Continuing with deployment
* 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2045.codfw.wmnet to cluster codfw and group A
* 13:48 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2045.codfw.wmnet to cluster codfw and group A
* 13:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2027.codfw.wmnet to cluster codfw and group A
* 13:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2027.codfw.wmnet to cluster codfw and group A
* 13:46 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 13:45 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 13:44 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* {{safesubst:SAL entry|1=13:42 cscott@deploy1003: cscott: Backport for [[gerrit:1298929{{!}}Simplify fragment processing (T423700)]], [[gerrit:1298926{{!}}Move ::getFragmentsToTransform() to Content<nowiki>{</nowiki>Text,DOM<nowiki>}</nowiki>TransformStage]], [[gerrit:1298927{{!}}OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages]], [[gerrit:1298925{{!}}Reset DeduplicateStyles state between different pipeline executions (T428336 T428215)]], [[gerrit:1299497{{!}}Store indicators}}
* 13:41 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* {{safesubst:SAL entry|1=13:40 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1298929{{!}}Simplify fragment processing (T423700)]], [[gerrit:1298926{{!}}Move ::getFragmentsToTransform() to Content<nowiki>{</nowiki>Text,DOM<nowiki>}</nowiki>TransformStage]], [[gerrit:1298927{{!}}OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages]], [[gerrit:1298925{{!}}Reset DeduplicateStyles state between different pipeline executions (T428336 T428215)]], [[gerrit:1299497{{!}}}}
* 13:40 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw
* 13:39 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 13:37 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 13:35 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 13:33 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 13:32 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 13:32 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298834{{!}}config: Disable EmailConfirmationBanner on all wikis (T428291)]] (duration: 07m 01s)
* 13:30 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2153: Migration of db2153.codfw.wmnet completed
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 lucaswerkmeister-wmde@deploy1003: mmartorana, lucaswerkmeister-wmde: Continuing with deployment
* 13:27 lucaswerkmeister-wmde@deploy1003: mmartorana, lucaswerkmeister-wmde: Backport for [[gerrit:1298834{{!}}config: Disable EmailConfirmationBanner on all wikis (T428291)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:26 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1184: Migration of db1184.eqiad.wmnet completed
* 13:25 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298834{{!}}config: Disable EmailConfirmationBanner on all wikis (T428291)]]
* 13:25 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:24 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:21 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 13:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2153.codfw.wmnet with OS trixie
* 13:20 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2241: rack depool
* 13:20 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1237: repool after maintenance db1237
* 13:19 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298654{{!}}Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206)]] (duration: 09m 40s)
* 13:17 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2006.codfw.wmnet
* 13:17 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2006.codfw.wmnet
* 13:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2251-2253].codfw.wmnet
* 13:16 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2251-2253].codfw.wmnet
* 13:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2005.codfw.wmnet
* 13:16 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2005.codfw.wmnet
* 13:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1184.eqiad.wmnet with OS trixie
* 13:14 lucaswerkmeister-wmde@deploy1003: neriah, lucaswerkmeister-wmde: Continuing with deployment
* 13:11 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
* 13:11 lucaswerkmeister-wmde@deploy1003: neriah, lucaswerkmeister-wmde: Backport for [[gerrit:1298654{{!}}Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:09 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298654{{!}}Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206)]]
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2153.codfw.wmnet with reason: host reimage
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1015.eqiad.wmnet with OS trixie
* 12:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1184.eqiad.wmnet with reason: host reimage
* 12:58 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2153.codfw.wmnet with reason: host reimage
* 12:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1016.eqiad.wmnet with OS trixie
* 12:57 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
* 12:56 XioNoX: lsw1-a4-codfw> request system reboot - [[phab:T427357|T427357]]
* 12:55 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1184.eqiad.wmnet with reason: host reimage
* 12:50 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299477{{!}}hCaptcha: Roll out to all wikis for api account creation. (T426050)]] (duration: 07m 21s)
* 12:46 kharlan@deploy1003: kharlan, dbrant: Continuing with deployment
* 12:46 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
* 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 12:45 kharlan@deploy1003: kharlan, dbrant: Backport for [[gerrit:1299477{{!}}hCaptcha: Roll out to all wikis for api account creation. (T426050)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:45 topranks: shut sub-interfaces for row A/B legacy vlans on cr1-codfw [[phab:T427357|T427357]]
* 12:45 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
* 12:43 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1299477{{!}}hCaptcha: Roll out to all wikis for api account creation. (T426050)]]
* 12:42 topranks: increase OSPF cost on ssw1-a1-codfw link to lsw1-a4-codfw to force traffic via alternate spine [[phab:T427357|T427357]]
* 12:41 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299478{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]] (duration: 07m 02s)
* 12:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 12:40 moritzm: installing wireshark security updates
* 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2153.codfw.wmnet with OS trixie
* 12:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1184.eqiad.wmnet with OS trixie
* 12:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:36 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1299478{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2153: Upgrading db2153.codfw.wmnet
* 12:34 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1237: repool after maintenance db1237
* 12:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1299478{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]]
* 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2153: Upgrading db2153.codfw.wmnet
* 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1184: Upgrading db1184.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1184: Upgrading db1184.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1237.eqiad.wmnet with OS trixie
* 12:32 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 12:32 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 12:29 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:29 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:27 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2005.codfw.wmnet
* 12:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2046: repool after maintenance
* 12:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2006.codfw.wmnet
* 12:23 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298829{{!}}wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126)]] (duration: 16m 04s)
* 12:23 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2006.codfw.wmnet
* 12:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2251-2253].codfw.wmnet
* 12:22 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2005.codfw.wmnet
* 12:20 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2251-2253].codfw.wmnet
* 12:20 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:20 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: rack depool
* 12:20 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:20 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2241: rack depool
* 12:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1016
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1016
* 12:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1015
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1015
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1016.eqiad.wmnet with OS trixie
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1015.eqiad.wmnet with OS trixie
* 12:17 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
* 12:17 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 24 hosts with reason: Rack A4 depool
* 12:16 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Continuing with deployment
* 12:15 topranks: drain traffic on ssw1-a1-codfw - add gshut community in evpn underlay - [[phab:T427357|T427357]]
* 12:14 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
* 12:13 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Backport for [[gerrit:1298829{{!}}wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1237.eqiad.wmnet with reason: host reimage
* 12:07 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1298829{{!}}wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126)]]
* 12:05 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1237.eqiad.wmnet with reason: host reimage
* 12:00 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Dmaza out of all services on: 2435 hosts
* 11:51 atsuko@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 11:51 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
* 11:49 atsuko@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 11:48 atsuko@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 11:47 atsuko@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 11:45 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:44 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:43 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:43 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2046: repool after maintenance
* 11:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:36 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2046.codfw.wmnet with OS trixie
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2185.codfw.wmnet with reason: Reimage
* 11:31 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging HMonroy out of all services on: 2435 hosts
* 11:28 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging KSiebert out of all services on: 2435 hosts
* 11:26 slyngs: CAS-SSO upgrade to version 7.3.7.2
* 11:26 slyngshede@dns1004: END - running authdns-update
* 11:24 slyngshede@dns1004: START - running authdns-update
* 11:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2046.codfw.wmnet with reason: host reimage
* 11:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1043: repool after upgrade
* 11:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2046.codfw.wmnet with reason: host reimage
* 10:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2046.codfw.wmnet with OS trixie
* 10:53 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2046: Upgrading es2046.codfw.wmnet
* 10:53 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2046: Upgrading es2046.codfw.wmnet
* 10:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 10:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 10:52 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 10:52 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:52 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 10:51 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:32 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1043: repool after upgrade
* 10:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:28 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1160: Repooling
* 10:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1043.eqiad.wmnet with OS trixie
* 10:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:17 elukey: complete rollout of apache2 upgrades
* 10:16 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:15 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:13 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:12 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:12 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1043.eqiad.wmnet with reason: host reimage
* 10:04 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1043.eqiad.wmnet with reason: host reimage
* 10:04 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:04 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:04 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:57 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
* 09:51 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:51 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:50 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:50 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS trixie
* 09:48 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1043: Upgrading es1043.eqiad.wmnet
* 09:48 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:47 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:45 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 09:41 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 09:36 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=5 --verbose --last-checked="20260603"` (after stopping previous scan run)
* 09:34 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=5 --verbose` (after stopping previous scan run)
* 09:27 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 09:26 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 09:17 fceratto@cumin1003: MariaDB change: Setting sections s5 as read-write
* 09:17 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1043: Upgrading es1043.eqiad.wmnet
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:12 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1042 to es4 eqiad primary [[phab:T428386|T428386]]', diff saved to https://phabricator.wikimedia.org/P93943 and previous config saved to /var/cache/conftool/dbconfig/20260609-091215-marostegui.json
* 09:11 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1043 to es4 eqiad primary [[phab:T428386|T428386]]', diff saved to https://phabricator.wikimedia.org/P93942 and previous config saved to /var/cache/conftool/dbconfig/20260609-091147-marostegui.json
* 09:03 jiji@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=registry2005.codfw.wmnet
* 08:59 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:59 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:57 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1237.eqiad.wmnet with OS trixie
* 08:55 jiji@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=registry2005.codfw.wmnet
* 08:55 jiji@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=registry2004.codfw.wmnet
* 08:50 jiji@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=registry2004.codfw.wmnet
* 08:22 jiji@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=docker-registry,name=codfw
* 08:22 jiji@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=docker-registry,name=eqiad
* 08:08 jiji@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=docker-registry,name=eqiad
* 08:08 jiji@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=docker-registry,name=codfw
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix typoes - ayounsi@cumin1003"
* 07:59 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix typoes - ayounsi@cumin1003"
* 07:52 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:47 brouberol@dns1004: END - running authdns-update
* 07:46 brouberol@dns1004: START - running authdns-update
* 07:44 brouberol@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:43 brouberol@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:43 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:42 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:41 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:39 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:38 brouberol@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 07:37 brouberol@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 07:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
* 07:36 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.major-upgrade (exit_code=97)
* 07:36 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 07:36 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:26 fceratto@dns1004: END - running authdns-update
* 07:24 fceratto@dns1004: START - running authdns-update
* 07:22 marostegui@dns1004: END - running authdns-update
* 07:21 marostegui@dns1004: START - running authdns-update
* 07:19 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:19 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fix dse-k8s-wdqs2002 duplicate ipv6 address - elukey@cumin1003"
* 07:19 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fix dse-k8s-wdqs2002 duplicate ipv6 address - elukey@cumin1003"
* 07:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 07:12 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 07:11 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1160: Repooling
* 07:11 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
* 07:11 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1160: Repooling
* 07:11 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
* 07:00 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:00 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1237.eqiad.wmnet with OS trixie
* 06:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1160 [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93940 and previous config saved to /var/cache/conftool/dbconfig/20260609-062412-fceratto.json
* 06:17 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:16 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:16 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:16 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:15 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:15 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:15 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:14 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:12 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1244 to s4 primary and set section read-write [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93939 and previous config saved to /var/cache/conftool/dbconfig/20260609-061222-fceratto.json
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Set s4 eqiad as read-only for maintenance - [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93938 and previous config saved to /var/cache/conftool/dbconfig/20260609-061131-fceratto.json
* 06:10 federico3: Starting s4 eqiad failover from db1160 to db1244 - [[phab:T426086|T426086]]
* 06:01 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1244 with weight 0 [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93937 and previous config saved to /var/cache/conftool/dbconfig/20260609-060121-fceratto.json
* 06:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 40 hosts with reason: Primary switchover s4 [[phab:T426086|T426086]]
* 05:40 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
* 05:37 marostegui@dns1004: START - running authdns-update
* 05:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1237: Upgrading db1237.eqiad.wmnet
* 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1237: Upgrading db1237.eqiad.wmnet
* 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1237 [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93935 and previous config saved to /var/cache/conftool/dbconfig/20260609-052420-marostegui.json
* 05:23 marostegui@dns1004: START - running authdns-update
* 05:23 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1220 to x1 primary and set section read-write [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93934 and previous config saved to /var/cache/conftool/dbconfig/20260609-052311-marostegui.json
* 05:22 marostegui@cumin1003: dbctl commit (dc=all): 'Set x1 eqiad as read-only for maintenance - [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93933 and previous config saved to /var/cache/conftool/dbconfig/20260609-052253-marostegui.json
* 05:22 marostegui: Starting x1 eqiad failover from db1237 to db1220 - [[phab:T428158|T428158]]
* 05:19 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1220 with weight 0 [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93932 and previous config saved to /var/cache/conftool/dbconfig/20260609-051859-marostegui.json
* 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 16 hosts with reason: Primary switchover x1 [[phab:T428158|T428158]]
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.3 (duration: 02m 43s)
* 03:40 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.6 refs [[phab:T423915|T423915]] (duration: 37m 16s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-08 ==
* 22:00 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298915{{!}}CommonSettings: Set $wgScoreSafeMode = false (T428484)]] (duration: 07m 42s)
* 21:56 reedy@deploy1003: reedy: Continuing with deployment
* 21:54 reedy@deploy1003: reedy: Backport for [[gerrit:1298915{{!}}CommonSettings: Set $wgScoreSafeMode = false (T428484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:53 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1298915{{!}}CommonSettings: Set $wgScoreSafeMode = false (T428484)]]
* 21:12 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298891{{!}}OOUIHTMLForm: Avoid treating form header as a clickable label (T428359)]] (duration: 08m 10s)
* 21:07 mlitn@deploy1003: mlitn, neriah: Continuing with deployment
* 21:05 mlitn@deploy1003: mlitn, neriah: Backport for [[gerrit:1298891{{!}}OOUIHTMLForm: Avoid treating form header as a clickable label (T428359)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:03 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1298891{{!}}OOUIHTMLForm: Avoid treating form header as a clickable label (T428359)]]
* 20:43 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297162{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias]], [[gerrit:1298841{{!}}Squashed diff to master]] (duration: 07m 05s)
* 20:39 mlitn@deploy1003: mlitn: Continuing with deployment
* 20:38 mlitn@deploy1003: mlitn: Backport for [[gerrit:1297162{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias]], [[gerrit:1298841{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:36 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1297162{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias]], [[gerrit:1298841{{!}}Squashed diff to master]]
* 20:29 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298390{{!}}English Wikibooks: update FlaggedRevs configuration (T428329)]], [[gerrit:1298328{{!}}English Wikiversity: Add new user group "autopatrolled" (T428269)]] (duration: 08m 58s)
* 20:25 mlitn@deploy1003: mlitn, vadymts1: Continuing with deployment
* 20:22 mlitn@deploy1003: mlitn, vadymts1: Backport for [[gerrit:1298390{{!}}English Wikibooks: update FlaggedRevs configuration (T428329)]], [[gerrit:1298328{{!}}English Wikiversity: Add new user group "autopatrolled" (T428269)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:20 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1298390{{!}}English Wikibooks: update FlaggedRevs configuration (T428329)]], [[gerrit:1298328{{!}}English Wikiversity: Add new user group "autopatrolled" (T428269)]]
* 20:03 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298879{{!}}SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437)]] (duration: 37m 43s)
* 19:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:31 kharlan@deploy1003: kharlan: Continuing with deployment
* 19:30 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:29 kharlan@deploy1003: kharlan: Backport for [[gerrit:1298879{{!}}SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:28 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:27 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:25 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1298879{{!}}SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437)]]
* 19:24 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab (duration: 01m 32s)
* 19:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:22 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab
* 19:20 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab (duration: 01m 40s)
* 19:19 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab
* 19:16 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:14 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2004
* 18:52 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2004
* 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2003
* 18:52 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2003
* 18:51 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:51 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2004 to codfw - jhancock@cumin2002"
* 18:51 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2004 to codfw - jhancock@cumin2002"
* 18:44 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:42 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:42 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2030 to codfw - jhancock@cumin2002"
* 18:42 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2030 to codfw - jhancock@cumin2002"
* 18:37 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:33 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2002
* 18:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2002
* 18:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2002 to codfw - jhancock@cumin2002"
* 18:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2002 to codfw - jhancock@cumin2002"
* 18:25 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:22 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2001
* 18:22 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2001
* 18:21 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating dse-k8s-wdqs2001 to codfw - jhancock@cumin2002"
* 18:21 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating dse-k8s-wdqs2001 to codfw - jhancock@cumin2002"
* 18:17 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:02 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T427286|T427286]] (duration: 00m 12s)
* 18:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T427286|T427286]]
* 17:37 jnuche@deploy1003: Installation of scap version "4.268.0" completed for 2 hosts
* 17:35 jnuche@deploy1003: Installing scap version "4.268.0" for 2 host(s)
* 17:21 claime: restarting varnish-frontend service on cp6012
* 17:21 claime: restarting varnish-frontend service on cp6011
* 17:21 claime: restarted varnish-frontend service on cp6009
* 17:13 taavi: bounce sirenbot to get it to re-join a channel
* 17:05 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 17:05 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:58 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 16:57 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 16:55 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 16:53 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 16:53 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 16:52 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 16:30 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 16:29 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 16:29 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 16:27 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 16:27 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 16:26 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 16:26 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 16:25 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 16:18 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 16:17 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 16:17 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 16:16 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 16:16 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:16 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:16 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 16:15 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 16:14 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 16:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 16:14 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 16:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 16:13 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 16:13 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 16:13 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 16:12 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 16:12 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 16:09 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 16:08 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 16:08 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 16:07 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 16:06 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 15:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2042: repool after upgrade
* 15:45 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db[2183-2184].codfw.wmnet
* 15:45 jynus@cumin2002: START - Cookbook sre.hosts.remove-downtime for db[2183-2184].codfw.wmnet
* 15:18 jynus: dbmaint on backup1-codfw@codfw ([[phab:T428467|T428467]])
* 15:12 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2042: repool after upgrade
* 15:12 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 15:09 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:09 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2042.codfw.wmnet with OS trixie
* 15:04 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:04 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:03 jynus@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2183-2184].codfw.wmnet with reason: Switchover db
* 15:03 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:01 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 15:00 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 14:59 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:55 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:55 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:54 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:50 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:50 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:50 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:49 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2042.codfw.wmnet with reason: host reimage
* 14:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2042.codfw.wmnet with reason: host reimage
* 14:32 Lucas_WMDE: UTC afternoon backport+config window done
* 14:32 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298709{{!}}Add translatable messages for WikiProject names (T427804)]], [[gerrit:1298710{{!}}Use translatable messages for WikiProject links (T427804)]], [[gerrit:1297644{{!}}WikiProject links - remove 'text' config (T427804)]] (duration: 31m 57s)
* 14:27 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:26 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2042.codfw.wmnet with OS trixie
* 14:26 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2042: Upgrading es2042.codfw.wmnet
* 14:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2042: Upgrading es2042.codfw.wmnet
* 14:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:24 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2043 to es4 codfw primary [[phab:T428386|T428386]]', diff saved to https://phabricator.wikimedia.org/P93926 and previous config saved to /var/cache/conftool/dbconfig/20260608-142423-marostegui.json
* 14:23 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1041: repool after maintenance
* 14:19 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Continuing with deployment
* 14:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Backport for [[gerrit:1298709{{!}}Add translatable messages for WikiProject names (T427804)]], [[gerrit:1298710{{!}}Use translatable messages for WikiProject links (T427804)]], [[gerrit:1297644{{!}}WikiProject links - remove 'text' config (T427804)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:11 cgoubert@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=liftwing-openapi-server.*
* 14:10 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp6013.*
* 14:10 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:05 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main'.
* 14:05 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main'.
* 13:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main'.
* 13:52 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:50 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296550{{!}}hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608)]] (duration: 08m 31s)
* 13:48 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:46 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 13:43 cgoubert@dns1004: END - running authdns-update
* 13:43 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296550{{!}}hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:41 cgoubert@dns1004: START - running authdns-update
* 13:41 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296550{{!}}hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608)]]
* 13:39 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* {{safesubst:SAL entry|1=13:38 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298758{{!}}feat(V2): toggle experiment features based on custom url override (T424646)]], [[gerrit:1298762{{!}}specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646)]], [[gerrit:1298764{{!}}fix: correctly read experiments param on Special:UserLogin]], [[gerrit:1298765{{!}}signup.js: use JS var instead of TestKitchen to show exp}}
* 13:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1041: repool after maintenance
* 13:38 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main'.
* 13:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:37 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 13:36 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 13:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1041.eqiad.wmnet with OS trixie
* 13:34 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 13:34 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2041: repool after upgrade
* 13:34 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde: Continuing with deployment
* 13:34 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 13:32 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* {{safesubst:SAL entry|1=13:30 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde: Backport for [[gerrit:1298758{{!}}feat(V2): toggle experiment features based on custom url override (T424646)]], [[gerrit:1298762{{!}}specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646)]], [[gerrit:1298764{{!}}fix: correctly read experiments param on Special:UserLogin]], [[gerrit:1298765{{!}}signup.js: use JS var instead of TestKitchen to show}}
* {{safesubst:SAL entry|1=13:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298758{{!}}feat(V2): toggle experiment features based on custom url override (T424646)]], [[gerrit:1298762{{!}}specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646)]], [[gerrit:1298764{{!}}fix: correctly read experiments param on Special:UserLogin]], [[gerrit:1298765{{!}}signup.js: use JS var instead of TestKitchen to show expe}}
* 13:21 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298418{{!}}NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206)]], [[gerrit:1298717{{!}}Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206)]], [[gerrit:1298734{{!}}Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206)]] (duration: 11m 06s)
* 13:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1041.eqiad.wmnet with reason: host reimage
* 13:17 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
* 13:12 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 13:12 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1298418{{!}}NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206)]], [[gerrit:1298717{{!}}Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206)]], [[gerrit:1298734{{!}}Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki
* 13:12 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 13:12 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1041.eqiad.wmnet with reason: host reimage
* 13:11 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 13:11 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 13:10 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298418{{!}}NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206)]], [[gerrit:1298717{{!}}Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206)]], [[gerrit:1298734{{!}}Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206)]]
* 12:57 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298767{{!}}Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608)]] (duration: 06m 20s)
* 12:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1041.eqiad.wmnet with OS trixie
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1041: Upgrading es1041.eqiad.wmnet
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:55 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1041: Upgrading es1041.eqiad.wmnet
* 12:55 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:54 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:53 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:53 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1298767{{!}}Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:51 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1298767{{!}}Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608)]]
* 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2041: repool after upgrade
* 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:46 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2063.codfw.wmnet with OS bullseye
* 12:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2062.codfw.wmnet with OS bullseye
* 12:27 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2041.codfw.wmnet with OS trixie
* 12:21 joal@deploy1003: Finished deploy [analytics/refinery@d67c584] (thin): Regular analytics weekly train THIN [analytics/refinery@d67c584f] (duration: 02m 00s)
* 12:19 joal@deploy1003: Started deploy [analytics/refinery@d67c584] (thin): Regular analytics weekly train THIN [analytics/refinery@d67c584f]
* 12:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
* 12:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 12:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 12:16 joal@deploy1003: Finished deploy [analytics/refinery@d67c584]: Regular analytics weekly train [analytics/refinery@d67c584f] (duration: 07m 52s)
* 12:15 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
* 12:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
* 12:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2041.codfw.wmnet with reason: host reimage
* 12:08 joal@deploy1003: Started deploy [analytics/refinery@d67c584]: Regular analytics weekly train [analytics/refinery@d67c584f]
* 12:08 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
* 12:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add eqiad e8 public vlans - ayounsi@cumin1003"
* 12:06 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add eqiad e8 public vlans - ayounsi@cumin1003"
* 12:03 joal@deploy1003: Finished deploy [analytics/refinery@d67c584] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d67c584f] (duration: 02m 00s)
* 12:03 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2041.codfw.wmnet with reason: host reimage
* 12:01 joal@deploy1003: Started deploy [analytics/refinery@d67c584] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d67c584f]
* 12:01 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:00 ayounsi@cumin1003: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 12:00 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:00 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 12:00 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2063
* 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2063
* 11:57 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2063
* 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2063.codfw.wmnet 52.16.192.10.in-addr.arpa 2.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:56 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2063.codfw.wmnet 52.16.192.10.in-addr.arpa 2.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:56 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:56 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2063 - mvernon@cumin2002"
* 11:56 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2063 - mvernon@cumin2002"
* 11:51 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:51 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2063
* 11:50 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2063.codfw.wmnet with OS bullseye
* 11:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2062
* 11:50 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2062
* 11:49 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2062
* 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2062.codfw.wmnet 123.0.192.10.in-addr.arpa 3.2.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:49 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2062.codfw.wmnet 123.0.192.10.in-addr.arpa 3.2.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2062 - mvernon@cumin2002"
* 11:49 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2062 - mvernon@cumin2002"
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2041.codfw.wmnet with OS trixie
* 11:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2041: Upgrading es2041.codfw.wmnet
* 11:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2041: Upgrading es2041.codfw.wmnet
* 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:44 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.major-upgrade (exit_code=97)
* 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: repool after maintenance
* 11:43 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:43 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2062
* 11:42 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2062.codfw.wmnet with OS bullseye
* 11:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298728{{!}}SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032)]] (duration: 17m 39s)
* 11:25 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:18 Raine: progressively switching shellbox to bookworm (start)
* 11:15 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 11:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 11:14 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1298728{{!}}SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:13 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 11:12 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 11:12 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1298728{{!}}SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032)]]
* 11:02 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2062
* 11:02 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2063
* 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1042: repool after maintenance
* 10:58 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:56 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1042.eqiad.wmnet with OS trixie
* 10:47 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298721{{!}}GuessedThumbnailInfo: Also allow showing webp originals (T428202)]] (duration: 16m 41s)
* 10:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1042.eqiad.wmnet with reason: host reimage
* 10:39 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 10:39 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 10:38 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 10:36 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 10:36 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 10:35 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2043: repool after upgrade
* 10:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2160.codfw.wmnet with reason: Reboot
* 10:34 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1298721{{!}}GuessedThumbnailInfo: Also allow showing webp originals (T428202)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:34 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1042.eqiad.wmnet with reason: host reimage
* 10:30 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1298721{{!}}GuessedThumbnailInfo: Also allow showing webp originals (T428202)]]
* 10:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1042.eqiad.wmnet with OS trixie
* 10:18 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:18 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:18 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:18 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1042: Upgrading es1042.eqiad.wmnet
* 10:14 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:14 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:14 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:14 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1042: Upgrading es1042.eqiad.wmnet
* 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:12 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2063
* 10:09 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2062
* 10:07 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:07 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:07 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:06 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:52 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 09:52 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 09:50 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 09:49 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 09:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2043: repool after upgrade
* 09:49 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2043.codfw.wmnet with OS trixie
* 09:44 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 09:44 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 09:42 ozge@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: sync
* 09:42 ozge@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: sync
* 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2043.codfw.wmnet with reason: host reimage
* 09:27 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 09:23 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2043.codfw.wmnet with reason: host reimage
* 09:17 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 09:15 ozge@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: sync
* 09:15 ozge@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: sync
* 09:07 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2043.codfw.wmnet with OS trixie
* 09:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2043: Upgrading es2043.codfw.wmnet
* 09:06 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2043: Upgrading es2043.codfw.wmnet
* 09:05 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1217.eqiad.wmnet with OS trixie
* 08:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1217.eqiad.wmnet with reason: host reimage
* 08:15 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database urwikisource ([[phab:T415977|T415977]])
* 08:14 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database urwikisource ([[phab:T415977|T415977]])
* 08:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1217.eqiad.wmnet with reason: host reimage
* 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2052: repool after upgrade
* 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1051: repool after maintenance
* 08:03 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis urwikisource in section s5
* 07:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1217.eqiad.wmnet with OS trixie
* 07:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1217.eqiad.wmnet with reason: reimage
* 07:53 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
* 07:52 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis urwikisource in section s5
* 07:50 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis urwikisource in section s5
* 07:50 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.sanitize-wiki (exit_code=97) Managing sanitization for wikis urwikisource in section s5
* 07:50 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
* 07:44 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297681{{!}}Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662)]] (duration: 32m 51s)
* 07:32 wmde-fisch@deploy1003: wmde-fisch, lilients: Continuing with deployment
* 07:29 wmde-fisch@deploy1003: wmde-fisch, lilients: Backport for [[gerrit:1297681{{!}}Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:21 elukey: upgrade sudo package on an-* hosts for [[phab:T428384|T428384]]
* 07:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2052: repool after upgrade
* 07:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1051: repool after maintenance
* 07:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:12 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database urwikisource ([[phab:T415977|T415977]])
* 07:12 elukey: upgrade exim4 packages on seaborgium for security upgrades
* 07:11 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1297681{{!}}Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662)]]
* 06:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1051.eqiad.wmnet with OS trixie
* 06:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1051.eqiad.wmnet with reason: host reimage
* 06:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1051.eqiad.wmnet with reason: host reimage
* 06:15 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database urwikisource ([[phab:T415977|T415977]])
* 05:58 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1051.eqiad.wmnet with OS trixie
* 05:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2052.codfw.wmnet with OS trixie
* 05:44 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1051: Upgrading es1051.eqiad.wmnet
* 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2052.codfw.wmnet with reason: host reimage
* 05:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2052.codfw.wmnet with reason: host reimage
* 05:35 marostegui@dns1004: END - running authdns-update
* 05:34 marostegui@dns1004: START - running authdns-update
* 05:33 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1051: Upgrading es1051.eqiad.wmnet
* 05:33 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:31 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1054 to es3 eqiad primary [[phab:T428050|T428050]]', diff saved to https://phabricator.wikimedia.org/P93895 and previous config saved to /var/cache/conftool/dbconfig/20260608-053156-marostegui.json
* 05:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2052.codfw.wmnet with OS trixie
* 05:18 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2052: Upgrading es2052.codfw.wmnet
* 05:18 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2052: Upgrading es2052.codfw.wmnet
* 05:18 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
== 2026-06-07 ==
* 16:32 elukey: <code>elukey@cumin1003:~$ sudo cumin 'cp6* and not cp6014* and not cp6010*' "varnish-frontend-restart" -b 1</code>
* 16:29 elukey: restart varnish-frontend on cp6014
== 2026-06-06 ==
* 09:07 ammarpad@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=hewiki --logwiki=metawiki W.Mechelke Tungsten_Mechelke # [[phab:T428182|T428182]]
== 2026-06-05 ==
* 22:16 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 21:01 Dreamy_Jazz: Running <code>mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=10 --verbose</code> (after stopping the other commons scan)
* 20:56 Dreamy_Jazz: Running <code>mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=30 --verbose</code> (after stopping the other commons scan)
* 20:20 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290093{{!}}Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)]] (duration: 10m 02s)
* 20:16 krinkle@deploy1003: krinkle: Continuing with deployment
* 20:12 krinkle@deploy1003: krinkle: Backport for [[gerrit:1290093{{!}}Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1290093{{!}}Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)]]
* 16:45 jgreen@dns1004: END - running authdns-update
* 16:44 jgreen@dns1004: START - running authdns-update
* 16:17 dzahn@dns1005: END - running authdns-update
* 16:17 mutante: DNS - adding new project language "mag" - Magahi - a language spoken in India and Nepal by about 12 million native speakers ([[phab:T428266|T428266]])
* 16:16 dzahn@dns1005: START - running authdns-update
* 14:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 12:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:30 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:30 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2202.codfw.wmnet with reason: Reboot
* 12:28 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:28 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:08 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:07 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:07 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:06 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:29 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:28 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:55 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:54 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:31 ozge@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1054: repool after upgrade
* 08:08 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 08:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 08:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 08:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:39 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1054: repool after upgrade
* 07:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:17 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:17 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:16 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:07 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 06:01 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1054.eqiad.wmnet with OS trixie
* 05:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1054.eqiad.wmnet with reason: host reimage
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1054.eqiad.wmnet with reason: host reimage
* 05:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1054.eqiad.wmnet with OS trixie
* 05:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1054: Upgrading es1054.eqiad.wmnet
* 05:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1054: Upgrading es1054.eqiad.wmnet
* 05:20 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 01:55 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1010.eqiad.wmnet with OS trixie
* 01:39 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
* 01:32 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
* 01:16 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS trixie
* 00:56 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1007.eqiad.wmnet with OS trixie
* 00:40 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
* 00:33 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
* 00:17 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1007.eqiad.wmnet with OS trixie
* 00:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297268{{!}}Redirect unknown wikinews languages to portal (T427126)]] (duration: 07m 02s)
== 2026-06-04 ==
* 23:57 ladsgroup@deploy1003: ladsgroup, pppery: Continuing with deployment
* 23:57 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1006.eqiad.wmnet with OS trixie
* 23:57 ladsgroup@deploy1003: ladsgroup, pppery: Backport for [[gerrit:1297268{{!}}Redirect unknown wikinews languages to portal (T427126)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:55 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1297268{{!}}Redirect unknown wikinews languages to portal (T427126)]]
* 23:40 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
* 23:36 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
* 23:20 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS trixie
* 21:28 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host releases1003.eqiad.wmnet with OS trixie
* 21:04 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: host reimage
* 20:58 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on releases1003.eqiad.wmnet with reason: host reimage
* 20:50 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.*
* 20:42 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host releases1003.eqiad.wmnet with OS trixie
* 20:27 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1100.eqiad.wmnet,service=(cdn{{!}}ats-be)
* 20:26 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6013.drmrs.wmnet,service=(cdn{{!}}ats-be)
* 20:20 brett@dns1006: END - running authdns-update
* 20:19 brett@dns1006: START - running authdns-update
* 20:18 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5030.eqsin.wmnet with OS trixie
* 20:10 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296015{{!}}Deploy PRV to 6 wikis (T427851)]] (duration: 07m 39s)
* 20:08 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist group2.dblist extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose`
* 20:06 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:04 arlolra@deploy1003: arlolra: Backport for [[gerrit:1296015{{!}}Deploy PRV to 6 wikis (T427851)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:02 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1296015{{!}}Deploy PRV to 6 wikis (T427851)]]
* 19:49 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:43 cmooney@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:15 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5030
* 19:15 cmooney@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5030
* 19:14 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cp5030
* 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5030.eqsin.wmnet 27.0.132.10.in-addr.arpa 7.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:14 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache cp5030.eqsin.wmnet 27.0.132.10.in-addr.arpa 7.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5030 - cmooney@cumin1003"
* 19:13 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5030 - cmooney@cumin1003"
* 19:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:08 cmooney@cumin1003: START - Cookbook sre.hosts.move-vlan for host cp5030
* 19:08 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host cp5030.eqsin.wmnet with OS trixie
* 18:51 cmooney@dns2005: END - running authdns-update
* 18:50 cmooney@dns2005: START - running authdns-update
* 18:43 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:42 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for eqsin cr links - cmooney@cumin1003"
* 18:40 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for eqsin cr links - cmooney@cumin1003"
* 18:37 sukhe: sukhe@cp6013:~$ sudo traffic_server -C clear_cache
* 18:36 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:08 dancy@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.5 refs [[phab:T423914|T423914]]
* 17:17 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297751{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297752{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] (duration: 06m 40s)
* 17:13 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:13 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297751{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297752{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297751{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297752{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]]
* 16:55 topranks: shift traffic off cr1-esams et-1/0/1 link to asw1-by27-esams [[phab:T427056|T427056]]
* 16:45 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297741{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297742{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] (duration: 13m 58s)
* 16:41 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 16:33 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297741{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297742{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:31 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297741{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297742{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]]
* 16:17 ozge@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:03 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297740{{!}}hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)]] (duration: 10m 21s)
* 16:03 elukey: uploaded spicerack_12.7.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 15:59 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:55 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297740{{!}}hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:53 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297740{{!}}hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)]]
* 15:44 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5030.*
* 15:41 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2007.codfw.wmnet with OS trixie
* 15:39 ladsgroup@cumin1003: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
* 15:28 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 15:24 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297730{{!}}ptwiki: Disable Article Guidance experiment (T426871)]] (duration: 07m 26s)
* 15:24 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
* 15:20 sbisson@deploy1003: sbisson: Continuing with deployment
* 15:19 sbisson@deploy1003: sbisson: Backport for [[gerrit:1297730{{!}}ptwiki: Disable Article Guidance experiment (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:19 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
* 15:17 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1297730{{!}}ptwiki: Disable Article Guidance experiment (T426871)]]
* 15:13 ladsgroup@cumin1003: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
* 15:06 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297724{{!}}Revert "Start reading from new file tables on commons"]] (duration: 07m 00s)
* 15:05 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 15:02 zabe@deploy1003: zabe: Continuing with deployment
* 15:01 zabe@deploy1003: zabe: Backport for [[gerrit:1297724{{!}}Revert "Start reading from new file tables on commons"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1297724{{!}}Revert "Start reading from new file tables on commons"]]
* 14:57 zabe@deploy1003: Finished scap sync-world: [[phab:T416548|T416548]] (duration: 05m 10s)
* 14:56 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-main2007.codfw.wmnet with OS trixie
* 14:52 zabe@deploy1003: Started scap sync-world: [[phab:T416548|T416548]]
* 14:50 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:49 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:43 zabe@deploy1003: sync-world aborted: Backport for [[gerrit:1270513{{!}}Start reading from new file tables on commons (T416548)]] (duration: 03m 58s)
* 14:43 zabe@deploy1003: zabe: Continuing with deployment
* 14:41 zabe@deploy1003: zabe: Backport for [[gerrit:1270513{{!}}Start reading from new file tables on commons (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f1-codfw
* 14:40 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-f1-codfw
* 14:39 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1270513{{!}}Start reading from new file tables on commons (T416548)]]
* 14:36 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297711{{!}}hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)]] (duration: 08m 20s)
* 14:32 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:30 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297711{{!}}hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1057: repool after upgrade
* 14:28 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297711{{!}}hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)]]
* 14:20 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:16 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 14:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 14:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297704{{!}}Use the globalblock-local-status right over globalblock-whitelist (T277942)]], [[gerrit:1296620{{!}}core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)]] (duration: 06m 46s)
* 14:10 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:08 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:08 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297704{{!}}Use the globalblock-local-status right over globalblock-whitelist (T277942)]], [[gerrit:1296620{{!}}core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 14:06 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297704{{!}}Use the globalblock-local-status right over globalblock-whitelist (T277942)]], [[gerrit:1296620{{!}}core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)]]
* 14:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:06 tappof: bump space for prometheus k8s-aux in eqiad
* 14:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 14:05 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:04 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
* 13:56 _joe_: transferred requestctl api tokens for all ops to the db ([[phab:T428119|T428119]])
* 13:56 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2050 to es3 codfw primary [[phab:T428050|T428050]]', diff saved to https://phabricator.wikimedia.org/P93878 and previous config saved to /var/cache/conftool/dbconfig/20260604-135631-marostegui.json
* 13:56 Dreamy_Jazz: Afternoon UTC backport window done
* 13:54 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297700{{!}}Revert "hCaptcha: Provide always challenge sitekey for account creation"]] (duration: 13m 38s)
* 13:51 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:50 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 13:47 sukhe: sukhe@cp6011:~$ sudo -i varnish-frontend-restart
* 13:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1057: repool after upgrade
* 13:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:43 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297700{{!}}Revert "hCaptcha: Provide always challenge sitekey for account creation"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1057.eqiad.wmnet with OS trixie
* 13:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297700{{!}}Revert "hCaptcha: Provide always challenge sitekey for account creation"]]
* 13:38 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297692{{!}}hCaptcha: Provide always challenge sitekey for account creation (T421041)]] (duration: 05m 27s)
* 13:38 dreamyjazz@deploy1003: dreamyjazz: Rolling back deployment
* 13:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: down
* 13:35 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297692{{!}}hCaptcha: Provide always challenge sitekey for account creation (T421041)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297692{{!}}hCaptcha: Provide always challenge sitekey for account creation (T421041)]]
* 13:31 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295978{{!}}Update config for WikiProjects linking prototype (T427804)]] (duration: 17m 13s)
* 13:26 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Continuing with deployment
* 13:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1057.eqiad.wmnet with reason: host reimage
* 13:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1057.eqiad.wmnet with reason: host reimage
* 13:16 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Backport for [[gerrit:1295978{{!}}Update config for WikiProjects linking prototype (T427804)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1295978{{!}}Update config for WikiProjects linking prototype (T427804)]]
* 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1220: Migration of db1220.eqiad.wmnet completed
* 13:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: down
* 13:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1224', diff saved to https://phabricator.wikimedia.org/P93875 and previous config saved to /var/cache/conftool/dbconfig/20260604-131219-marostegui.json
* 13:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1057.eqiad.wmnet with OS trixie
* 13:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1057: Upgrading es1057.eqiad.wmnet
* 12:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1057: Upgrading es1057.eqiad.wmnet
* 12:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:56 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296557{{!}}wmf-config: Skip CAPTCHA for action=mcrundo (T427612)]] (duration: 08m 30s)
* 12:52 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Continuing with deployment
* 12:50 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Backport for [[gerrit:1296557{{!}}wmf-config: Skip CAPTCHA for action=mcrundo (T427612)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2050: repool after upgrade
* 12:48 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296557{{!}}wmf-config: Skip CAPTCHA for action=mcrundo (T427612)]]
* 12:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 12:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 12:28 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1220: Migration of db1220.eqiad.wmnet completed
* 12:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1220.eqiad.wmnet with OS trixie
* 12:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2050: repool after upgrade
* 12:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1220.eqiad.wmnet with reason: host reimage
* 11:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1220.eqiad.wmnet with reason: host reimage
* 11:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1220.eqiad.wmnet with OS trixie
* 11:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2050.codfw.wmnet with OS trixie
* 11:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1220: Upgrading db1220.eqiad.wmnet
* 11:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1220: Upgrading db1220.eqiad.wmnet
* 11:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1179: Migration of db1179.eqiad.wmnet completed
* 11:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2050.codfw.wmnet with reason: host reimage
* 11:16 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2050.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2050.codfw.wmnet with OS trixie
* 11:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2050: Upgrading es2050.codfw.wmnet
* 10:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2050: Upgrading es2050.codfw.wmnet
* 10:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2057: repool after upgrade
* 10:58 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:55 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 10:46 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1179: Migration of db1179.eqiad.wmnet completed
* 10:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1179.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1179.eqiad.wmnet with reason: host reimage
* 10:16 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
* 10:15 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
* 10:15 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
* 10:15 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/kartotherian: apply
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1179.eqiad.wmnet with reason: host reimage
* 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2057: repool after upgrade
* 10:13 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2057.codfw.wmnet with OS trixie
* 09:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1179.eqiad.wmnet with OS trixie
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1179: Upgrading db1179.eqiad.wmnet
* 09:58 jynus: redoing m2 backups after grant change [[phab:T411111|T411111]]
* 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1179: Upgrading db1179.eqiad.wmnet
* 09:56 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2057.codfw.wmnet with reason: host reimage
* 09:53 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2057.codfw.wmnet with reason: host reimage
* 09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Migration of db1224.eqiad.wmnet completed
* 09:38 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:36 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:35 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:33 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2057.codfw.wmnet with OS trixie
* 09:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2057: Upgrading es2057.codfw.wmnet
* 09:32 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2057: Upgrading es2057.codfw.wmnet
* 09:31 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:26 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=30 --sleep=60 --verbose`
* 09:25 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "group0.dblist + group1.dblist - mediamoderation-continuous-scan.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose`
* 08:54 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Introduce pluggable authentication - oblivian@cumin1003"
* 08:54 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Introduce pluggable authentication - oblivian@cumin1003
* 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Migration of db1224.eqiad.wmnet completed
* 08:53 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Introduce pluggable authentication - oblivian@cumin1003
* 08:53 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Introduce pluggable authentication - oblivian@cumin1003"
* 08:29 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:29 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 08:24 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:24 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 08:21 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1224.eqiad.wmnet with OS trixie
* 08:21 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 08:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
* 08:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2249.codfw.wmnet with reason: upgrade
* 08:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
* 07:53 marostegui: Install mariadb 10.11.17 on db2249 [[phab:T427345|T427345]]
* 07:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1224.eqiad.wmnet with OS trixie
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1224: Upgrading db1224.eqiad.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1224: Upgrading db1224.eqiad.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1255: Migration of db1255.eqiad.wmnet completed
* 07:34 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297536{{!}}hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200{{!}}hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173{{!}}hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]] (duration: 08m 56s)
* 07:29 kharlan@deploy1003: kharlan, harroyo-wmf: Continuing with deployment
* 07:27 kharlan@deploy1003: kharlan, harroyo-wmf: Backport for [[gerrit:1297536{{!}}hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200{{!}}hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173{{!}}hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwd
* 07:25 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297536{{!}}hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200{{!}}hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173{{!}}hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]]
* 07:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2191: Migration of db2191.codfw.wmnet completed
* 07:12 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297550{{!}}Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]] (duration: 06m 45s)
* 07:08 kharlan@deploy1003: kharlan: Continuing with deployment
* 07:08 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297550{{!}}Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:06 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297550{{!}}Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]]
* 07:04 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297260{{!}}EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)]] (duration: 399m 30s)
* 07:03 otto@deploy1003: otto: Rolling back deployment
* 06:53 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1255: Migration of db1255.eqiad.wmnet completed
* 06:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1255.eqiad.wmnet with OS trixie
* 06:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2191: Migration of db2191.codfw.wmnet completed
* 06:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1255.eqiad.wmnet with reason: host reimage
* 06:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2191.codfw.wmnet with OS trixie
* 06:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1255.eqiad.wmnet with reason: host reimage
* 06:16 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1255.eqiad.wmnet with OS trixie
* 06:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2191.codfw.wmnet with reason: host reimage
* 06:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1255: Upgrading db1255.eqiad.wmnet
* 06:12 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1255: Upgrading db1255.eqiad.wmnet
* 06:12 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2191.codfw.wmnet with reason: host reimage
* 06:04 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db1255 [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93836 and previous config saved to /var/cache/conftool/dbconfig/20260604-060428-cwilliams.json
* 06:03 cwilliams@dns1004: END - running authdns-update
* 06:02 cwilliams@dns1004: START - running authdns-update
* 05:54 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db1258 to x3 primary and set section read-write [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93835 and previous config saved to /var/cache/conftool/dbconfig/20260604-055429-cwilliams.json
* 05:53 cwilliams@cumin1003: dbctl commit (dc=all): 'Set x3 eqiad as read-only for maintenance - [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93834 and previous config saved to /var/cache/conftool/dbconfig/20260604-055346-cwilliams.json
* 05:53 cezmunsta: Starting x3 eqiad failover from db1255 to db1258 - [[phab:T427895|T427895]]
* 05:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2191.codfw.wmnet with OS trixie
* 05:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2191: Upgrading db2191.codfw.wmnet
* 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2191: Upgrading db2191.codfw.wmnet
* 05:50 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db1258 with weight 0 [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93833 and previous config saved to /var/cache/conftool/dbconfig/20260604-055021-cwilliams.json
* 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:50 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T427895|T427895]]
* 05:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 05:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2191 [[phab:T428120|T428120]]', diff saved to https://phabricator.wikimedia.org/P93832 and previous config saved to /var/cache/conftool/dbconfig/20260604-054614-marostegui.json
* 05:45 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2215 to x1 primary [[phab:T428120|T428120]]', diff saved to https://phabricator.wikimedia.org/P93831 and previous config saved to /var/cache/conftool/dbconfig/20260604-054528-marostegui.json
* 05:44 marostegui: Starting x1 codfw failover from db2191 to db2215 - [[phab:T428120|T428120]]
* 05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 16 hosts with reason: Primary switchover x1 [[phab:T428120|T428120]]
* 05:27 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2215 with weight 0 [[phab:T428120|T428120]]', diff saved to https://phabricator.wikimedia.org/P93830 and previous config saved to /var/cache/conftool/dbconfig/20260604-052722-marostegui.json
* 05:19 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 03:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93829 and previous config saved to /var/cache/conftool/dbconfig/20260604-034546-fceratto.json
* 03:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P93828 and previous config saved to /var/cache/conftool/dbconfig/20260604-033538-fceratto.json
* 03:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P93827 and previous config saved to /var/cache/conftool/dbconfig/20260604-032531-fceratto.json
* 03:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93826 and previous config saved to /var/cache/conftool/dbconfig/20260604-031523-fceratto.json
* 03:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1263 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93825 and previous config saved to /var/cache/conftool/dbconfig/20260604-030710-fceratto.json
* 03:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1263.eqiad.wmnet with reason: Maintenance
* 03:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93824 and previous config saved to /var/cache/conftool/dbconfig/20260604-030642-fceratto.json
* 02:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P93823 and previous config saved to /var/cache/conftool/dbconfig/20260604-025634-fceratto.json
* 02:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P93822 and previous config saved to /var/cache/conftool/dbconfig/20260604-024627-fceratto.json
* 02:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93821 and previous config saved to /var/cache/conftool/dbconfig/20260604-023619-fceratto.json
* 02:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1262 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93820 and previous config saved to /var/cache/conftool/dbconfig/20260604-022809-fceratto.json
* 02:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1262.eqiad.wmnet with reason: Maintenance
* 02:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93819 and previous config saved to /var/cache/conftool/dbconfig/20260604-022742-fceratto.json
* 02:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P93818 and previous config saved to /var/cache/conftool/dbconfig/20260604-021734-fceratto.json
* 02:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P93817 and previous config saved to /var/cache/conftool/dbconfig/20260604-020726-fceratto.json
* 01:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93816 and previous config saved to /var/cache/conftool/dbconfig/20260604-015718-fceratto.json
* 01:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1261 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93815 and previous config saved to /var/cache/conftool/dbconfig/20260604-014909-fceratto.json
* 01:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1261.eqiad.wmnet with reason: Maintenance
* 01:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93814 and previous config saved to /var/cache/conftool/dbconfig/20260604-014841-fceratto.json
* 01:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P93813 and previous config saved to /var/cache/conftool/dbconfig/20260604-013833-fceratto.json
* 01:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P93812 and previous config saved to /var/cache/conftool/dbconfig/20260604-012826-fceratto.json
* 01:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93811 and previous config saved to /var/cache/conftool/dbconfig/20260604-011818-fceratto.json
* 01:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1260 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93810 and previous config saved to /var/cache/conftool/dbconfig/20260604-011005-fceratto.json
* 01:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1260.eqiad.wmnet with reason: Maintenance
* 01:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93809 and previous config saved to /var/cache/conftool/dbconfig/20260604-010937-fceratto.json
* 00:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P93808 and previous config saved to /var/cache/conftool/dbconfig/20260604-005929-fceratto.json
* 00:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P93807 and previous config saved to /var/cache/conftool/dbconfig/20260604-004922-fceratto.json
* 00:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93806 and previous config saved to /var/cache/conftool/dbconfig/20260604-003914-fceratto.json
* 00:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1252 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93805 and previous config saved to /var/cache/conftool/dbconfig/20260604-002851-fceratto.json
* 00:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1252.eqiad.wmnet with reason: Maintenance
* 00:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93804 and previous config saved to /var/cache/conftool/dbconfig/20260604-002821-fceratto.json
* 00:26 otto@deploy1003: otto: Backport for [[gerrit:1297260{{!}}EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:24 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1297260{{!}}EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)]]
* 00:18 Amir1: mwscript-k8s --follow --dblist=all -- extensions/timeline/maintenance/DeleteOldTimelineFiles.php --date 20210101000000
* 00:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P93803 and previous config saved to /var/cache/conftool/dbconfig/20260604-001813-fceratto.json
* 00:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P93802 and previous config saved to /var/cache/conftool/dbconfig/20260604-000805-fceratto.json
== 2026-06-03 ==
* 23:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93801 and previous config saved to /var/cache/conftool/dbconfig/20260603-235758-fceratto.json
* 23:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93800 and previous config saved to /var/cache/conftool/dbconfig/20260603-234935-fceratto.json
* 23:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
* 23:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93799 and previous config saved to /var/cache/conftool/dbconfig/20260603-234907-fceratto.json
* 23:42 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296561{{!}}Add a maintenance script to delete old files]], [[gerrit:1296560{{!}}Add a maintenance script to delete old files]] (duration: 07m 09s)
* 23:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P93798 and previous config saved to /var/cache/conftool/dbconfig/20260603-233859-fceratto.json
* 23:37 ladsgroup@deploy1003: ladsgroup, reedy: Continuing with deployment
* 23:36 ladsgroup@deploy1003: ladsgroup, reedy: Backport for [[gerrit:1296561{{!}}Add a maintenance script to delete old files]], [[gerrit:1296560{{!}}Add a maintenance script to delete old files]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:34 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1296561{{!}}Add a maintenance script to delete old files]], [[gerrit:1296560{{!}}Add a maintenance script to delete old files]]
* 23:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P93797 and previous config saved to /var/cache/conftool/dbconfig/20260603-232852-fceratto.json
* 23:22 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 23:22 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 23:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93796 and previous config saved to /var/cache/conftool/dbconfig/20260603-231844-fceratto.json
* 23:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93795 and previous config saved to /var/cache/conftool/dbconfig/20260603-231031-fceratto.json
* 23:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
* 23:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93794 and previous config saved to /var/cache/conftool/dbconfig/20260603-231001-fceratto.json
* 22:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P93793 and previous config saved to /var/cache/conftool/dbconfig/20260603-225953-fceratto.json
* 22:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P93792 and previous config saved to /var/cache/conftool/dbconfig/20260603-224945-fceratto.json
* 22:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93791 and previous config saved to /var/cache/conftool/dbconfig/20260603-223937-fceratto.json
* 22:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1244 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93790 and previous config saved to /var/cache/conftool/dbconfig/20260603-223116-fceratto.json
* 22:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1244.eqiad.wmnet with reason: Maintenance
* 22:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93789 and previous config saved to /var/cache/conftool/dbconfig/20260603-223048-fceratto.json
* 22:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P93788 and previous config saved to /var/cache/conftool/dbconfig/20260603-222041-fceratto.json
* 22:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P93787 and previous config saved to /var/cache/conftool/dbconfig/20260603-221034-fceratto.json
* 22:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93786 and previous config saved to /var/cache/conftool/dbconfig/20260603-220026-fceratto.json
* 21:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1243 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93785 and previous config saved to /var/cache/conftool/dbconfig/20260603-215110-fceratto.json
* 21:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93784 and previous config saved to /var/cache/conftool/dbconfig/20260603-215053-fceratto.json
* 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P93783 and previous config saved to /var/cache/conftool/dbconfig/20260603-214046-fceratto.json
* 21:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P93782 and previous config saved to /var/cache/conftool/dbconfig/20260603-213038-fceratto.json
* 21:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93781 and previous config saved to /var/cache/conftool/dbconfig/20260603-212030-fceratto.json
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1242 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93779 and previous config saved to /var/cache/conftool/dbconfig/20260603-211206-fceratto.json
* 21:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
* 21:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93778 and previous config saved to /var/cache/conftool/dbconfig/20260603-211138-fceratto.json
* 21:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P93774 and previous config saved to /var/cache/conftool/dbconfig/20260603-210130-fceratto.json
* 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P93773 and previous config saved to /var/cache/conftool/dbconfig/20260603-205122-fceratto.json
* 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93772 and previous config saved to /var/cache/conftool/dbconfig/20260603-204115-fceratto.json
* 20:33 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297228{{!}}Attribution research don't use testKitchen compatibility layer (T417050)]] (duration: 06m 41s)
* 20:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1241 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93771 and previous config saved to /var/cache/conftool/dbconfig/20260603-203254-fceratto.json
* 20:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
* 20:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93770 and previous config saved to /var/cache/conftool/dbconfig/20260603-203227-fceratto.json
* 20:29 cjming@deploy1003: cjming: Continuing with deployment
* 20:29 cjming@deploy1003: cjming: Backport for [[gerrit:1297228{{!}}Attribution research don't use testKitchen compatibility layer (T417050)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:26 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1297228{{!}}Attribution research don't use testKitchen compatibility layer (T417050)]]
* 20:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P93769 and previous config saved to /var/cache/conftool/dbconfig/20260603-202219-fceratto.json
* 20:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P93766 and previous config saved to /var/cache/conftool/dbconfig/20260603-201211-fceratto.json
* 20:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93765 and previous config saved to /var/cache/conftool/dbconfig/20260603-200203-fceratto.json
* 19:59 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
* 19:59 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
* 19:59 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 19:59 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 19:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93764 and previous config saved to /var/cache/conftool/dbconfig/20260603-195341-fceratto.json
* 19:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
* 19:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93763 and previous config saved to /var/cache/conftool/dbconfig/20260603-195313-fceratto.json
* 19:47 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 19:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P93762 and previous config saved to /var/cache/conftool/dbconfig/20260603-194306-fceratto.json
* 19:39 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 19:37 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 19:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P93761 and previous config saved to /var/cache/conftool/dbconfig/20260603-193258-fceratto.json
* 19:26 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
* 19:25 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
* 19:25 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 19:25 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 19:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93760 and previous config saved to /var/cache/conftool/dbconfig/20260603-192250-fceratto.json
* 19:22 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 19:22 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 19:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93759 and previous config saved to /var/cache/conftool/dbconfig/20260603-191437-fceratto.json
* 19:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1024-1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 19:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
* 19:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93758 and previous config saved to /var/cache/conftool/dbconfig/20260603-191348-fceratto.json
* 19:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P93757 and previous config saved to /var/cache/conftool/dbconfig/20260603-190340-fceratto.json
* 18:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P93756 and previous config saved to /var/cache/conftool/dbconfig/20260603-185331-fceratto.json
* 18:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93755 and previous config saved to /var/cache/conftool/dbconfig/20260603-184324-fceratto.json
* 18:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1199 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93754 and previous config saved to /var/cache/conftool/dbconfig/20260603-183455-fceratto.json
* 18:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
* 18:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93753 and previous config saved to /var/cache/conftool/dbconfig/20260603-183427-fceratto.json
* 18:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P93752 and previous config saved to /var/cache/conftool/dbconfig/20260603-182420-fceratto.json
* 18:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P93751 and previous config saved to /var/cache/conftool/dbconfig/20260603-181412-fceratto.json
* 18:10 dancy@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.5 refs [[phab:T423914|T423914]]
* 18:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93750 and previous config saved to /var/cache/conftool/dbconfig/20260603-180404-fceratto.json
* 17:57 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 17:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93749 and previous config saved to /var/cache/conftool/dbconfig/20260603-175544-fceratto.json
* 17:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
* 17:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93748 and previous config saved to /var/cache/conftool/dbconfig/20260603-175342-fceratto.json
* 17:52 hashar: contint1003: sudo puppet agent --disable "Prevent Jenkins from coming back"
* 17:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P93747 and previous config saved to /var/cache/conftool/dbconfig/20260603-174334-fceratto.json
* 17:38 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:37 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:37 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:33 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P93746 and previous config saved to /var/cache/conftool/dbconfig/20260603-173327-fceratto.json
* 17:33 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:32 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 17:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93745 and previous config saved to /var/cache/conftool/dbconfig/20260603-172319-fceratto.json
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: Stopping before sync operations
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: Started scap sync-world: No-deploy scap run to verify scap config change
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1253 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93744 and previous config saved to /var/cache/conftool/dbconfig/20260603-171521-fceratto.json
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1253.eqiad.wmnet with reason: Maintenance
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93743 and previous config saved to /var/cache/conftool/dbconfig/20260603-171452-fceratto.json
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:12 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:10 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:10 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:10 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:09 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2012.wikimedia.org with OS trixie
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P93742 and previous config saved to /var/cache/conftool/dbconfig/20260603-170444-fceratto.json
* 17:04 swfrench@deploy1003: Stopping before sync operations
* 17:03 swfrench@deploy1003: Started scap sync-world: No-deploy scap run to verify clean state before config change
* 16:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P93741 and previous config saved to /var/cache/conftool/dbconfig/20260603-165436-fceratto.json
* 16:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:53 hashar: Restarting CI Jenkins one last time # [[phab:T418521|T418521]]
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:44 btullis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295922{{!}}Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)]] (duration: 07m 16s)
* 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93740 and previous config saved to /var/cache/conftool/dbconfig/20260603-164428-fceratto.json
* 16:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:42 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:41 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:40 btullis@deploy1003: btullis: Continuing with deployment
* 16:39 btullis@deploy1003: btullis: Backport for [[gerrit:1295922{{!}}Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93739 and previous config saved to /var/cache/conftool/dbconfig/20260603-163726-fceratto.json
* 16:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance
* 16:37 btullis@deploy1003: Started scap sync-world: Backport for [[gerrit:1295922{{!}}Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)]]
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93738 and previous config saved to /var/cache/conftool/dbconfig/20260603-163658-fceratto.json
* 16:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P93737 and previous config saved to /var/cache/conftool/dbconfig/20260603-162650-fceratto.json
* 16:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P93736 and previous config saved to /var/cache/conftool/dbconfig/20260603-161643-fceratto.json
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93735 and previous config saved to /var/cache/conftool/dbconfig/20260603-160635-fceratto.json
* 16:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93734 and previous config saved to /var/cache/conftool/dbconfig/20260603-155928-fceratto.json
* 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93733 and previous config saved to /var/cache/conftool/dbconfig/20260603-155859-fceratto.json
* 15:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P93732 and previous config saved to /var/cache/conftool/dbconfig/20260603-154852-fceratto.json
* 15:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:46 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2012.wikimedia.org with OS trixie
* 15:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:40 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
* 15:40 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
* 15:40 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 15:39 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 15:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P93731 and previous config saved to /var/cache/conftool/dbconfig/20260603-153844-fceratto.json
* 15:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93729 and previous config saved to /var/cache/conftool/dbconfig/20260603-152836-fceratto.json
* 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
* 15:25 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
* 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
* 15:25 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
* 15:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:23 mutante: disabling jenkins on CI servers for maintenance
* 15:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
* 15:23 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1202 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93728 and previous config saved to /var/cache/conftool/dbconfig/20260603-152129-fceratto.json
* 15:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 15:21 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2012 to codfw - jhancock@cumin2002"
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93727 and previous config saved to /var/cache/conftool/dbconfig/20260603-152102-fceratto.json
* 15:20 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2012 to codfw - jhancock@cumin2002"
* 15:18 brouberol@dns1004: END - running authdns-update
* 15:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:16 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:16 brouberol@dns1004: START - running authdns-update
* 15:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P93726 and previous config saved to /var/cache/conftool/dbconfig/20260603-151055-fceratto.json
* 15:01 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P93725 and previous config saved to /var/cache/conftool/dbconfig/20260603-150047-fceratto.json
* 14:57 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 cmooney@cumin1003: END (FAIL) - Cookbook sre.netbox.update-extras (exit_code=1) rolling restart_daemons on A:netbox
* 14:51 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93723 and previous config saved to /var/cache/conftool/dbconfig/20260603-145039-fceratto.json
* 14:48 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297137{{!}}Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"]] (duration: 06m 46s)
* 14:47 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 14:46 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:46 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:43 mlitn@deploy1003: mlitn: Continuing with deployment
* 14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93722 and previous config saved to /var/cache/conftool/dbconfig/20260603-144334-fceratto.json
* 14:43 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 14:43 mlitn@deploy1003: mlitn: Backport for [[gerrit:1297137{{!}}Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93721 and previous config saved to /var/cache/conftool/dbconfig/20260603-144306-fceratto.json
* 14:41 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:41 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:41 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1297137{{!}}Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"]]
* 14:39 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:39 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:39 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:39 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:38 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 14:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 14:34 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297130{{!}}editor: make redesigned anon warning the default experience (T424595)]] (duration: 10m 45s)
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P93719 and previous config saved to /var/cache/conftool/dbconfig/20260603-143259-fceratto.json
* 14:30 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:28 sgimeno@deploy1003: sgimeno: Continuing with deployment
* 14:25 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1297130{{!}}editor: make redesigned anon warning the default experience (T424595)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:24 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:24 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:23 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1297130{{!}}editor: make redesigned anon warning the default experience (T424595)]]
* 14:23 gengh@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P93717 and previous config saved to /var/cache/conftool/dbconfig/20260603-142251-fceratto.json
* 14:22 gengh@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:22 gengh@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:21 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:21 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:21 gengh@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:20 gengh@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:20 gengh@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:20 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:20 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:19 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:19 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:16 vriley@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:16 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:16 gengh@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:13 gengh@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:12 gengh@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93716 and previous config saved to /var/cache/conftool/dbconfig/20260603-141242-fceratto.json
* 14:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:11 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:11 gengh@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc2055.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:10 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mc2055.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:10 gengh@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:09 gengh@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:08 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:07 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:05 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296631{{!}}translate: adding separate read/write endpoints (T425377)]] (duration: 13m 06s)
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1191 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93715 and previous config saved to /var/cache/conftool/dbconfig/20260603-140537-fceratto.json
* 14:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93714 and previous config saved to /var/cache/conftool/dbconfig/20260603-140507-fceratto.json
* 14:01 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:58 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 13:56 dcausse@deploy1003: atsuko, dcausse: Rolling back deployment
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T426633|T426633]])', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-133440-fceratto.json
* 13:29 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2186: Migration of db2186.codfw.wmnet completed
* 13:28 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295910{{!}}hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)]] (duration: 07m 36s)
* 13:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T426633|T426633]])', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-132638-fceratto.json
* 13:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 13:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93710 and previous config saved to /var/cache/conftool/dbconfig/20260603-132605-fceratto.json
* 13:25 sukhe: sudo cumin 'A:lvs or A:liberica' 'disable-puppet "merging CR 1282764"'
* 13:23 kharlan@deploy1003: kharlan: Continuing with deployment
* 13:22 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295910{{!}}hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295910{{!}}hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)]]
* 13:18 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296649{{!}}hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)]] (duration: 07m 46s)
* 13:16 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-131556-fceratto.json
* 13:15 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:13 kharlan@deploy1003: dbrant, kharlan: Continuing with deployment
* 13:12 kharlan@deploy1003: dbrant, kharlan: Backport for [[gerrit:1296649{{!}}hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296649{{!}}hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)]]
* 13:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add codfw d3 and e5 public vlans - ayounsi@cumin1003"
* 13:09 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add codfw d3 and e5 public vlans - ayounsi@cumin1003"
* 13:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P93708 and previous config saved to /var/cache/conftool/dbconfig/20260603-130548-fceratto.json
* 13:05 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93706 and previous config saved to /var/cache/conftool/dbconfig/20260603-125540-fceratto.json
* 12:51 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297110{{!}}ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)]] (duration: 07m 44s)
* 12:49 jgreen@dns1004: END - running authdns-update
* 12:47 jgreen@dns1004: START - running authdns-update
* 12:46 jiji@deploy1003: jiji: Continuing with deployment
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93705 and previous config saved to /var/cache/conftool/dbconfig/20260603-124624-fceratto.json
* 12:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93704 and previous config saved to /var/cache/conftool/dbconfig/20260603-124556-fceratto.json
* 12:45 jiji@deploy1003: jiji: Backport for [[gerrit:1297110{{!}}ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:43 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2186: Migration of db2186.codfw.wmnet completed
* 12:43 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1297110{{!}}ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)]]
* 12:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1067.eqiad.wmnet with OS bullseye
* 12:38 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1292364{{!}}Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)]] (duration: 11m 15s)
* 12:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2186.codfw.wmnet with OS trixie
* 12:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P93702 and previous config saved to /var/cache/conftool/dbconfig/20260603-123548-fceratto.json
* 12:34 dreamyjazz@deploy1003: somerandomdeveloper, dreamyjazz: Continuing with deployment
* 12:31 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1066.eqiad.wmnet with OS bullseye
* 12:29 dreamyjazz@deploy1003: somerandomdeveloper, dreamyjazz: Backport for [[gerrit:1292364{{!}}Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:27 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1292364{{!}}Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)]]
* 12:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P93701 and previous config saved to /var/cache/conftool/dbconfig/20260603-122541-fceratto.json
* 12:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1067.eqiad.wmnet with reason: host reimage
* 12:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2186.codfw.wmnet with reason: host reimage
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93700 and previous config saved to /var/cache/conftool/dbconfig/20260603-121533-fceratto.json
* 12:13 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ms-be1066.eqiad.wmnet with reason: host reimage
* 12:13 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2186.codfw.wmnet with reason: host reimage
* 12:11 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1067.eqiad.wmnet with reason: host reimage
* 12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93699 and previous config saved to /var/cache/conftool/dbconfig/20260603-120732-fceratto.json
* 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 12:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93698 and previous config saved to /var/cache/conftool/dbconfig/20260603-120634-fceratto.json
* 12:03 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1066.eqiad.wmnet with reason: host reimage
* 11:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P93697 and previous config saved to /var/cache/conftool/dbconfig/20260603-115626-fceratto.json
* 11:54 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2186.codfw.wmnet with OS trixie
* 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1067
* 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1067
* 11:52 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1067
* 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1067.eqiad.wmnet 96.48.64.10.in-addr.arpa 6.9.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:52 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1067.eqiad.wmnet 96.48.64.10.in-addr.arpa 6.9.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1067 - mvernon@cumin2002"
* 11:52 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1067 - mvernon@cumin2002"
* 11:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2186: Upgrading db2186.codfw.wmnet
* 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2186: Upgrading db2186.codfw.wmnet
* 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:47 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:46 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1067
* 11:46 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1067.eqiad.wmnet with OS bullseye
* 11:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P93695 and previous config saved to /var/cache/conftool/dbconfig/20260603-114618-fceratto.json
* 11:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1066
* 11:46 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1066
* 11:45 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1066
* 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1066.eqiad.wmnet 117.32.64.10.in-addr.arpa 7.1.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:45 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1066.eqiad.wmnet 117.32.64.10.in-addr.arpa 7.1.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1066 - mvernon@cumin2002"
* 11:45 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1066 - mvernon@cumin2002"
* 11:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 11:41 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:40 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1066
* 11:40 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1066.eqiad.wmnet with OS bullseye
* 11:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1067
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93693 and previous config saved to /var/cache/conftool/dbconfig/20260603-113611-fceratto.json
* 11:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Migration of db2196.codfw.wmnet completed
* 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93691 and previous config saved to /var/cache/conftool/dbconfig/20260603-112909-fceratto.json
* 11:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 6 hosts with reason: Maintenance
* 11:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
* 11:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93690 and previous config saved to /var/cache/conftool/dbconfig/20260603-112838-fceratto.json
* 11:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P93689 and previous config saved to /var/cache/conftool/dbconfig/20260603-111831-fceratto.json
* 11:14 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:09 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 11:09 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 11:08 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P93687 and previous config saved to /var/cache/conftool/dbconfig/20260603-110823-fceratto.json
* 11:07 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1066
* 11:07 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 11:06 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
* 11:05 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
* 11:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289895{{!}}Update UserInfoCard to be enabled by default for certain user groups (T426021)]] (duration: 07m 37s)
* 11:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:59 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 10:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:59 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 10:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 10:58 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93685 and previous config saved to /var/cache/conftool/dbconfig/20260603-105815-fceratto.json
* 10:58 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:56 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 10:55 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1289895{{!}}Update UserInfoCard to be enabled by default for certain user groups (T426021)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:54 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 10:54 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
* 10:53 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
* 10:53 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1289895{{!}}Update UserInfoCard to be enabled by default for certain user groups (T426021)]]
* 10:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1198 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93684 and previous config saved to /var/cache/conftool/dbconfig/20260603-105006-fceratto.json
* 10:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 10:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93683 and previous config saved to /var/cache/conftool/dbconfig/20260603-104939-fceratto.json
* 10:45 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:45 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Migration of db2196.codfw.wmnet completed
* 10:44 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:41 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:40 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P93681 and previous config saved to /var/cache/conftool/dbconfig/20260603-103931-fceratto.json
* 10:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1053: repool after upgrade
* 10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2196.codfw.wmnet with OS trixie
* 10:36 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297090{{!}}hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)]] (duration: 12m 03s)
* 10:32 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 10:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P93679 and previous config saved to /var/cache/conftool/dbconfig/20260603-102924-fceratto.json
* 10:26 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297090{{!}}hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:24 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297090{{!}}hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)]]
* 10:22 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1067
* 10:21 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1066
* 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93677 and previous config saved to /var/cache/conftool/dbconfig/20260603-101916-fceratto.json
* 10:15 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2013.codfw.wmnet
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
* 10:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93676 and previous config saved to /var/cache/conftool/dbconfig/20260603-101105-fceratto.json
* 10:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 10:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93675 and previous config saved to /var/cache/conftool/dbconfig/20260603-101037-fceratto.json
* 10:10 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2013.codfw.wmnet
* 10:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P93673 and previous config saved to /var/cache/conftool/dbconfig/20260603-100029-fceratto.json
* 09:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2196.codfw.wmnet with OS trixie
* 09:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: Upgrading db2196.codfw.wmnet
* 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2196: Upgrading db2196.codfw.wmnet
* 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1053: repool after upgrade
* 09:52 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:52 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:51 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:51 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:51 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P93670 and previous config saved to /var/cache/conftool/dbconfig/20260603-095022-fceratto.json
* 09:49 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:49 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:48 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1053.eqiad.wmnet with OS trixie
* 09:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2013.codfw.wmnet
* 09:41 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on es1053.eqiad.wmnet with reason: host reimage
* 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1053.eqiad.wmnet with reason: host reimage
* 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93669 and previous config saved to /var/cache/conftool/dbconfig/20260603-094014-fceratto.json
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2215: Migration of db2215.codfw.wmnet completed
* 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2013.codfw.wmnet
* 09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93667 and previous config saved to /var/cache/conftool/dbconfig/20260603-093146-fceratto.json
* 09:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93666 and previous config saved to /var/cache/conftool/dbconfig/20260603-093119-fceratto.json
* 09:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1211: Migration of db1211.eqiad.wmnet completed
* 09:27 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297069{{!}}hCaptcha: Collect risk score for blocked account creations (T427784)]] (duration: 07m 26s)
* 09:25 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1053.eqiad.wmnet with OS trixie
* 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add public1-b3-codfw gateway IPs - ayounsi@cumin1003"
* 09:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add public1-b3-codfw gateway IPs - ayounsi@cumin1003"
* 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1053: Upgrading es1053.eqiad.wmnet
* 09:23 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1053: Upgrading es1053.eqiad.wmnet
* 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:21 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297069{{!}}hCaptcha: Collect risk score for blocked account creations (T427784)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:21 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 09:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2054: repool after upgrade
* 09:21 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
* 09:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P93661 and previous config saved to /var/cache/conftool/dbconfig/20260603-092111-fceratto.json
* 09:20 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:20 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297069{{!}}hCaptcha: Collect risk score for blocked account creations (T427784)]]
* 09:14 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297065{{!}}Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"]] (duration: 07m 06s)
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P93659 and previous config saved to /var/cache/conftool/dbconfig/20260603-091104-fceratto.json
* 09:10 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:09 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297065{{!}}Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:07 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297065{{!}}Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"]]
* 09:06 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 09:06 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297064{{!}}Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] (duration: 10m 54s)
* 09:05 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 09:04 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 09:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003 - [[phab:T422043|T422043]]"
* 09:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93656 and previous config saved to /var/cache/conftool/dbconfig/20260603-090056-fceratto.json
* 09:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003 - [[phab:T422043|T422043]]"
* 09:00 ayounsi@cumin1003: END (ERROR) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=97) generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003"
* 09:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003"
* 08:59 kharlan@deploy1003: kharlan: Continuing with deployment
* 08:59 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297064{{!}}Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:55 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297064{{!}}Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]]
* 08:53 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296635{{!}}Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] (duration: 11m 43s)
* 08:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2215: Migration of db2215.codfw.wmnet completed
* 08:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet
* 08:52 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet
* 08:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb[1022-1023].eqiad.wmnet
* 08:51 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb[1022-1023].eqiad.wmnet
* 08:50 kharlan@deploy1003: kharlan: Rolling back deployment
* 08:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93652 and previous config saved to /var/cache/conftool/dbconfig/20260603-084846-fceratto.json
* 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 08:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93651 and previous config saved to /var/cache/conftool/dbconfig/20260603-084819-fceratto.json
* 08:47 kharlan@deploy1003: kharlan: Backport for [[gerrit:1296635{{!}}Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2215.codfw.wmnet with OS trixie
* 08:45 jiji@cumin1003: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) check docker-registry: maintenance
* 08:45 jiji@cumin1003: START - Cookbook sre.discovery.service-route check docker-registry: maintenance
* 08:43 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1211: Migration of db1211.eqiad.wmnet completed
* 08:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296635{{!}}Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]]
* 08:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1211.eqiad.wmnet with OS trixie
* 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93649 and previous config saved to /var/cache/conftool/dbconfig/20260603-083811-fceratto.json
* 08:37 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296632{{!}}Image Browsing: add accessible labels to carousel elements (T407793)]] (duration: 32m 11s)
* 08:36 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2054: repool after upgrade
* 08:35 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool es2054.codfw.wmnet: After reimage
* 08:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2054.codfw.wmnet: After reimage
* 08:35 jiji@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:34 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 08:34 jiji@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:33 jiji@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:33 jiji@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:31 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:31 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2054.codfw.wmnet with OS trixie
* 08:30 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:29 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2215.codfw.wmnet with reason: host reimage
* 08:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93647 and previous config saved to /var/cache/conftool/dbconfig/20260603-082804-fceratto.json
* 08:25 mszwarc@deploy1003: mlitn, mszwarc: Continuing with deployment
* 08:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1049: repool after upgrade
* 08:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2215.codfw.wmnet with reason: host reimage
* 08:22 mszwarc@deploy1003: mlitn, mszwarc: Backport for [[gerrit:1296632{{!}}Image Browsing: add accessible labels to carousel elements (T407793)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
* 08:18 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 08:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93645 and previous config saved to /var/cache/conftool/dbconfig/20260603-081756-fceratto.json
* 08:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 08:17 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 08:16 jiji@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2054.codfw.wmnet with reason: host reimage
* 08:08 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2054.codfw.wmnet with reason: host reimage
* 08:05 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1296632{{!}}Image Browsing: add accessible labels to carousel elements (T407793)]]
* 08:04 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296580{{!}}Add kha to wmgExtraLanguageNames (T427917)]], [[gerrit:1296703{{!}}jawiki: lift IP caps for workshop (T427912)]], [[gerrit:1296713{{!}}conductwiki: add sitename and logo (T426984 T427541)]], [[gerrit:1296627{{!}}Add missing lazy img to carousel (T427821)]], [[gerrit:1295968{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T426799)]]
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93643 and previous config saved to /var/cache/conftool/dbconfig/20260603-080346-fceratto.json
* 08:03 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1211.eqiad.wmnet with OS trixie
* 08:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 08:03 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2215.codfw.wmnet with OS trixie
* 08:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1211: Upgrading db1211.eqiad.wmnet
* 08:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2215: Upgrading db2215.codfw.wmnet
* 08:01 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:01 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1211: Upgrading db1211.eqiad.wmnet
* 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2215: Upgrading db2215.codfw.wmnet
* 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:01 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1157: Repooling
* 08:01 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1157: Repooling
* 08:00 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:57 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1022-1023].eqiad.wmnet with reason: Reimaging upstream server
* 07:57 mszwarc@deploy1003: anzx, mlitn, mfossati, mszwarc: Continuing with deployment
* 07:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Reimaging upstream server
* 07:54 mszwarc@deploy1003: anzx, mlitn, mfossati, mszwarc: Backport for [[gerrit:1296580{{!}}Add kha to wmgExtraLanguageNames (T427917)]], [[gerrit:1296703{{!}}jawiki: lift IP caps for workshop (T427912)]], [[gerrit:1296713{{!}}conductwiki: add sitename and logo (T426984 T427541)]], [[gerrit:1296627{{!}}Add missing lazy img to carousel (T427821)]], [[gerrit:1295968{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T42
* 07:52 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2231: repool after maintenance
* 07:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2054.codfw.wmnet with OS trixie
* 07:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2054: Upgrading es2054.codfw.wmnet
* 07:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2054: Upgrading es2054.codfw.wmnet
* 07:50 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1296580{{!}}Add kha to wmgExtraLanguageNames (T427917)]], [[gerrit:1296703{{!}}jawiki: lift IP caps for workshop (T427912)]], [[gerrit:1296713{{!}}conductwiki: add sitename and logo (T426984 T427541)]], [[gerrit:1296627{{!}}Add missing lazy img to carousel (T427821)]], [[gerrit:1295968{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T426799)]]
* 07:48 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296516{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]], [[gerrit:1296517{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]] (duration: 32m 13s)
* 07:44 marostegui@dns1004: END - running authdns-update
* 07:43 marostegui@dns1004: START - running authdns-update
* 07:42 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1056 to es2 eqiad primary [[phab:T427875|T427875]]', diff saved to https://phabricator.wikimedia.org/P93637 and previous config saved to /var/cache/conftool/dbconfig/20260603-074250-marostegui.json
* 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1049: repool after upgrade
* 07:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:35 mszwarc@deploy1003: mszwarc, stran: Continuing with deployment
* 07:35 mszwarc@deploy1003: mszwarc, stran: Backport for [[gerrit:1296516{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]], [[gerrit:1296517{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1049.eqiad.wmnet with OS trixie
* 07:16 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1296516{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]], [[gerrit:1296517{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]]
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1049.eqiad.wmnet with reason: host reimage
* 07:07 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1049.eqiad.wmnet with reason: host reimage
* 07:07 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2231: repool after maintenance
* 07:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2231.codfw.wmnet with OS trixie
* 06:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1049.eqiad.wmnet with OS trixie
* 06:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1049: Upgrading es1049.eqiad.wmnet
* 06:46 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2056 to es2 codfw primary [[phab:T427875|T427875]]', diff saved to https://phabricator.wikimedia.org/P93632 and previous config saved to /var/cache/conftool/dbconfig/20260603-064623-marostegui.json
* 06:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1049: Upgrading es1049.eqiad.wmnet
* 06:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1056: repool after upgrade
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2231.codfw.wmnet with reason: host reimage
* 06:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2231.codfw.wmnet with reason: host reimage
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2231.codfw.wmnet with OS trixie
* 06:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2231: Upgrading db2231.codfw.wmnet
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2231: Upgrading db2231.codfw.wmnet
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:59 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1056: repool after upgrade
* 05:59 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 05:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1056.eqiad.wmnet with OS trixie
* 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1056.eqiad.wmnet with reason: host reimage
* 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1056.eqiad.wmnet with reason: host reimage
* 05:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1056.eqiad.wmnet with OS trixie
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1056: Upgrading es1056.eqiad.wmnet
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1056: Upgrading es1056.eqiad.wmnet
* 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
== 2026-06-02 ==
* 22:21 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296689{{!}}hCaptcha: Correct inaccurate comment]] (duration: 06m 27s)
* 22:18 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:18 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:17 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 22:17 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296689{{!}}hCaptcha: Correct inaccurate comment]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:15 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296689{{!}}hCaptcha: Correct inaccurate comment]]
* 22:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296551{{!}}hCaptcha: Enable for badlogin on group0 wikis (T426875)]] (duration: 08m 31s)
* 22:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:09 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 22:07 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296551{{!}}hCaptcha: Enable for badlogin on group0 wikis (T426875)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:05 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296551{{!}}hCaptcha: Enable for badlogin on group0 wikis (T426875)]]
* 20:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93621 and previous config saved to /var/cache/conftool/dbconfig/20260602-203945-fceratto.json
* 20:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93620 and previous config saved to /var/cache/conftool/dbconfig/20260602-202937-fceratto.json
* 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1054.eqiad.wmnet
* 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1054.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:26 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1054.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:20 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 20:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93619 and previous config saved to /var/cache/conftool/dbconfig/20260602-201929-fceratto.json
* 20:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93618 and previous config saved to /var/cache/conftool/dbconfig/20260602-200922-fceratto.json
* 20:03 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1054.eqiad.wmnet
* 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1053.eqiad.wmnet
* 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1053.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:37 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1053.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93617 and previous config saved to /var/cache/conftool/dbconfig/20260602-190907-fceratto.json
* 19:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 19:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93616 and previous config saved to /var/cache/conftool/dbconfig/20260602-190811-fceratto.json
* 19:05 dancy@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.5 refs [[phab:T423914|T423914]]
* 18:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P93615 and previous config saved to /var/cache/conftool/dbconfig/20260602-185804-fceratto.json
* 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P93614 and previous config saved to /var/cache/conftool/dbconfig/20260602-184757-fceratto.json
* 18:38 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:38 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:38 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93612 and previous config saved to /var/cache/conftool/dbconfig/20260602-183749-fceratto.json
* 18:37 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:37 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:33 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1053.eqiad.wmnet
* 18:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93611 and previous config saved to /var/cache/conftool/dbconfig/20260602-183023-fceratto.json
* 18:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93610 and previous config saved to /var/cache/conftool/dbconfig/20260602-182956-fceratto.json
* 18:27 mutante: gerrit delete unused plugin projects: barricade, WikimediaBlocks and WikimediaWebSessions
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1052.eqiad.wmnet
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1052.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1052.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:25 dancy: Train is blocked at testwikis on https://phabricator.wikimedia.org/T427935
* 18:21 Daimona: Running query from [[phab:T427962|T427962]]#11978299 in x1.wikishared
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P93609 and previous config saved to /var/cache/conftool/dbconfig/20260602-181949-fceratto.json
* 18:16 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296615{{!}}feat(cleanMentorList): Add a feature flag (T427386)]], [[gerrit:1296614{{!}}feat(cleanMentorList): Add a feature flag (T427386)]] (duration: 34m 09s)
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:12 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:12 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:12 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P93608 and previous config saved to /var/cache/conftool/dbconfig/20260602-180941-fceratto.json
* 18:08 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 18:07 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 18:06 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 18:06 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 18:04 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 18:02 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 18:02 swfrench-wmf: reverting shellbox to 2026-05-20-192555 due to errors in shellbox-syntaxhighlight
* 18:02 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 18:01 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 18:01 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 18:01 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1296615{{!}}feat(cleanMentorList): Add a feature flag (T427386)]], [[gerrit:1296614{{!}}feat(cleanMentorList): Add a feature flag (T427386)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:00 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1052.eqiad.wmnet
* 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93607 and previous config saved to /var/cache/conftool/dbconfig/20260602-175933-fceratto.json
* 17:58 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:57 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1051.eqiad.wmnet
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1051.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:55 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1051.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:53 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93605 and previous config saved to /var/cache/conftool/dbconfig/20260602-175227-fceratto.json
* 17:52 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93604 and previous config saved to /var/cache/conftool/dbconfig/20260602-175157-fceratto.json
* 17:51 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:51 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:50 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:50 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:50 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:49 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:49 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:48 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:48 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:47 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:44 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:42 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:42 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:42 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P93603 and previous config saved to /var/cache/conftool/dbconfig/20260602-174150-fceratto.json
* 17:41 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1296615{{!}}feat(cleanMentorList): Add a feature flag (T427386)]], [[gerrit:1296614{{!}}feat(cleanMentorList): Add a feature flag (T427386)]]
* 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P93602 and previous config saved to /var/cache/conftool/dbconfig/20260602-173143-fceratto.json
* 17:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93601 and previous config saved to /var/cache/conftool/dbconfig/20260602-172135-fceratto.json
* 17:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93600 and previous config saved to /var/cache/conftool/dbconfig/20260602-171422-fceratto.json
* 17:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93599 and previous config saved to /var/cache/conftool/dbconfig/20260602-171354-fceratto.json
* 17:04 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P93598 and previous config saved to /var/cache/conftool/dbconfig/20260602-170344-fceratto.json
* 16:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P93597 and previous config saved to /var/cache/conftool/dbconfig/20260602-165336-fceratto.json
* 16:49 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1051.eqiad.wmnet
* 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1050.eqiad.wmnet
* 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1050.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:47 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1050.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93596 and previous config saved to /var/cache/conftool/dbconfig/20260602-164328-fceratto.json
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93595 and previous config saved to /var/cache/conftool/dbconfig/20260602-163622-fceratto.json
* 16:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 16:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93594 and previous config saved to /var/cache/conftool/dbconfig/20260602-163550-fceratto.json
* 16:34 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:34 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS trixie
* 16:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:27 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2006.codfw.wmnet with OS trixie
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P93593 and previous config saved to /var/cache/conftool/dbconfig/20260602-162542-fceratto.json
* 16:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P93591 and previous config saved to /var/cache/conftool/dbconfig/20260602-161534-fceratto.json
* 16:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 16:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS trixie
* 16:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296624{{!}}Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] (duration: 06m 40s)
* 16:09 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
* 16:05 kharlan@deploy1003: kharlan: Continuing with deployment
* 16:05 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 16:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93590 and previous config saved to /var/cache/conftool/dbconfig/20260602-160527-fceratto.json
* 16:05 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
* 16:05 kharlan@deploy1003: kharlan: Backport for [[gerrit:1296624{{!}}Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:03 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296624{{!}}Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]]
* 15:59 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295909{{!}}hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)]] (duration: 09m 48s)
* 15:59 kharlan@deploy1003: kharlan: Rolling back deployment
* 15:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93589 and previous config saved to /var/cache/conftool/dbconfig/20260602-155817-fceratto.json
* 15:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 15:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93588 and previous config saved to /var/cache/conftool/dbconfig/20260602-155749-fceratto.json
* 15:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 15:53 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS trixie
* 15:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS trixie
* 15:51 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295909{{!}}hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:50 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 15:49 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295909{{!}}hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)]]
* 15:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P93587 and previous config saved to /var/cache/conftool/dbconfig/20260602-154742-fceratto.json
* 15:47 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296558{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]], [[gerrit:1296568{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]] (duration: 07m 24s)
* 15:43 kharlan@deploy1003: kharlan: Continuing with deployment
* 15:42 kharlan@deploy1003: kharlan: Backport for [[gerrit:1296558{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]], [[gerrit:1296568{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:40 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296558{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]], [[gerrit:1296568{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]]
* 15:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P93586 and previous config saved to /var/cache/conftool/dbconfig/20260602-153734-fceratto.json
* 15:37 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS trixie
* 15:36 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS trixie
* 15:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 15:32 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:32 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:31 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 15:30 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:29 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93585 and previous config saved to /var/cache/conftool/dbconfig/20260602-152726-fceratto.json
* 15:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2158: Repooling
* {{safesubst:SAL entry|1=15:22 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295502{{!}}Revert "labswiki: Disallow account autocreation"]], [[gerrit:1283106{{!}}Remove unused 'writeapi' right]], [[gerrit:1296566{{!}}Clean up bot password configuration]], [[gerrit:1296563{{!}}Remove workaround for stuck session cookies on Wikitech (T389433)]], [[gerrit:1295574{{!}}cswiki: lift IP cap for workshop on 08-June-2026 (T427678)]], [[gerrit:1296582{{!}}U}}
* 15:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 15:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93583 and previous config saved to /var/cache/conftool/dbconfig/20260602-152026-fceratto.json
* 15:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93582 and previous config saved to /var/cache/conftool/dbconfig/20260602-151958-fceratto.json
* 15:19 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:19 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:18 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS trixie
* 15:18 dreamyjazz@deploy1003: matmarex, anzx, dreamyjazz: Continuing with deployment
* 15:18 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:15 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* {{safesubst:SAL entry|1=15:15 dreamyjazz@deploy1003: matmarex, anzx, dreamyjazz: Backport for [[gerrit:1295502{{!}}Revert "labswiki: Disallow account autocreation"]], [[gerrit:1283106{{!}}Remove unused 'writeapi' right]], [[gerrit:1296566{{!}}Clean up bot password configuration]], [[gerrit:1296563{{!}}Remove workaround for stuck session cookies on Wikitech (T389433)]], [[gerrit:1295574{{!}}cswiki: lift IP cap for workshop on 08-June-2026 (T427678)]], [[gerrit:1296582}}
* 15:14 jiji@cumin1003: START - Cookbook sre.dns.netbox
* {{safesubst:SAL entry|1=15:13 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1295502{{!}}Revert "labswiki: Disallow account autocreation"]], [[gerrit:1283106{{!}}Remove unused 'writeapi' right]], [[gerrit:1296566{{!}}Clean up bot password configuration]], [[gerrit:1296563{{!}}Remove workaround for stuck session cookies on Wikitech (T389433)]], [[gerrit:1295574{{!}}cswiki: lift IP cap for workshop on 08-June-2026 (T427678)]], [[gerrit:1296582{{!}}Us}}
* 15:12 jayme@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2006.codfw.wmnet with OS trixie
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS trixie
* 15:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P93580 and previous config saved to /var/cache/conftool/dbconfig/20260602-150951-fceratto.json
* 15:09 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296514{{!}}[Growth] Set wgGEMentorshipCleanupEnabled to false on all wikis (T427386)]] (duration: 06m 22s)
* 15:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Repooling after Icinga wait-for-green timeout
* 15:06 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1050.eqiad.wmnet
* 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1049.eqiad.wmnet
* 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1049.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:05 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1049.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:02 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1296514{{!}}[Growth] Set wgGEMentorshipCleanupEnabled to false on all wikis (T427386)]]
* 15:02 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS trixie
* 15:01 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P93578 and previous config saved to /var/cache/conftool/dbconfig/20260602-145943-fceratto.json
* 14:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 14:52 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:52 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:52 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1049.eqiad.wmnet
* 14:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS trixie
* 14:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:50 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93575 and previous config saved to /var/cache/conftool/dbconfig/20260602-144935-fceratto.json
* 14:42 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for pc2021.codfw.wmnet
* 14:42 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for pc2021.codfw.wmnet
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2250.codfw.wmnet
* 14:41 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2250.codfw.wmnet
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2158.codfw.wmnet
* 14:41 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2158.codfw.wmnet
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Repooling
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Repooling
* 14:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93573 and previous config saved to /var/cache/conftool/dbconfig/20260602-144110-fceratto.json
* 14:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2158: Repooling
* 14:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93571 and previous config saved to /var/cache/conftool/dbconfig/20260602-144043-fceratto.json
* 14:38 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:38 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:38 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1048.eqiad.wmnet
* 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1048.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 14:37 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS trixie
* 14:36 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS trixie
* 14:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P93569 and previous config saved to /var/cache/conftool/dbconfig/20260602-143035-fceratto.json
* 14:30 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 14:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1048.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 14:21 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Repooling after Icinga wait-for-green timeout
* 14:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P93566 and previous config saved to /var/cache/conftool/dbconfig/20260602-142027-fceratto.json
* 14:17 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS trixie
* 14:17 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 14:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1167.eqiad.wmnet
* 14:17 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1167.eqiad.wmnet
* 14:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS trixie
* 14:15 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS trixie
* 14:14 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93564 and previous config saved to /var/cache/conftool/dbconfig/20260602-141019-fceratto.json
* 14:09 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments userOptions.php --delete --nowarn growthexperiments-homepage-variant # [[phab:T417621|T417621]]
* 14:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1048.eqiad.wmnet
* 14:08 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments userOptions.php --delete growthexperiments-homepage-variant # [[phab:T417621|T417621]]
* 14:05 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93563 and previous config saved to /var/cache/conftool/dbconfig/20260602-140140-fceratto.json
* 14:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 14:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 14:01 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS trixie
* 14:00 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 14:00 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93562 and previous config saved to /var/cache/conftool/dbconfig/20260602-140022-fceratto.json
* 14:00 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS trixie
* 13:56 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1167.eqiad.wmnet with OS trixie
* 13:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P93561 and previous config saved to /var/cache/conftool/dbconfig/20260602-135015-fceratto.json
* 13:47 topranks: revert all config to normal on cr1-codfw and ssw1-a1-codfw
* 13:43 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS trixie
* 13:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 13:40 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS trixie
* 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P93560 and previous config saved to /var/cache/conftool/dbconfig/20260602-134007-fceratto.json
* 13:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
* 13:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1002.eqiad.wmnet with OS trixie
* 13:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1003.eqiad.wmnet with OS trixie
* 13:34 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:34 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:32 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 13:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93559 and previous config saved to /var/cache/conftool/dbconfig/20260602-132959-fceratto.json
* 13:27 slyngshede@dns1004: END - running authdns-update
* 13:25 slyngshede@dns1004: START - running authdns-update
* 13:24 topranks: increase OSPF cost on ssw1-a1-codfw et-0/0/4 towards lsw1-a5-codfw [[phab:T427301|T427301]]
* 13:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 13:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93558 and previous config saved to /var/cache/conftool/dbconfig/20260602-132314-fceratto.json
* 13:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
* 13:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93557 and previous config saved to /var/cache/conftool/dbconfig/20260602-132246-fceratto.json
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS trixie
* 13:19 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 13:19 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS trixie
* 13:18 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 13:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2049: repool after upgrade
* 13:17 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:16 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1167.eqiad.wmnet with OS trixie
* 13:15 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1167: Upgrading db1167.eqiad.wmnet
* 13:13 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1167: Upgrading db1167.eqiad.wmnet
* 13:13 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:12 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 13:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P93554 and previous config saved to /var/cache/conftool/dbconfig/20260602-131238-fceratto.json
* 13:12 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 13:12 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 13:11 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 13:07 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1003.eqiad.wmnet with OS trixie
* 13:07 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS trixie
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS trixie
* 13:04 jayme@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2006.codfw.wmnet with OS trixie
* 13:04 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:03 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1022-1023].eqiad.wmnet with reason: Reimaging upstream servers
* 13:03 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1001.eqiad.wmnet with OS trixie
* 13:03 topranks: increase OSPF cost on ssw1-a1-codfw et-0/0/2 towards lsw1-a3-codfw [[phab:T427301|T427301]]
* 13:03 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 13:02 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Reimaging upstream servers
* 13:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P93553 and previous config saved to /var/cache/conftool/dbconfig/20260602-130230-fceratto.json
* 12:59 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2161: Migration of db2161.codfw.wmnet completed
* 12:54 topranks: shutdown sub-interfaces on cr1-codfw et-1/1/5 for row A/B vlans [[phab:T427301|T427301]]
* 12:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93550 and previous config saved to /var/cache/conftool/dbconfig/20260602-125223-fceratto.json
* 12:50 topranks: enable bgp graceful-shutdown in overlay on ssw1-a1-codfw [[phab:T427301|T427301]]
* 12:49 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1061.eqiad.wmnet with OS trixie
* 12:48 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt
* 12:48 ayounsi@cumin1003: START - Cookbook sre.hosts.remove-downtime for lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt
* 12:47 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS trixie
* 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93548 and previous config saved to /var/cache/conftool/dbconfig/20260602-124541-fceratto.json
* 12:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1207.eqiad.wmnet with reason: Maintenance
* 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93547 and previous config saved to /var/cache/conftool/dbconfig/20260602-124512-fceratto.json
* 12:43 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1060.eqiad.wmnet with OS trixie
* 12:42 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:42 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 12:42 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 12:41 topranks: enable bgp graceful-shutdown in underlay on ssw1-a1-codfw [[phab:T427301|T427301]]
* 12:35 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 12:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P93545 and previous config saved to /var/cache/conftool/dbconfig/20260602-123505-fceratto.json
* 12:33 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 12:33 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 12:31 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2049: repool after upgrade
* 12:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:29 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS trixie
* 12:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2049.codfw.wmnet with OS trixie
* 12:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P93542 and previous config saved to /var/cache/conftool/dbconfig/20260602-122459-fceratto.json
* 12:24 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS trixie
* 12:21 XioNoX: reboot lsw1-a3-codfw for software upgrade - [[phab:T427301|T427301]]
* 12:20 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS trixie
* 12:20 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 12:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS trixie
* 12:17 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 12:16 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296532{{!}}hCaptcha: Deduplicate edit API detection code (T427887)]], [[gerrit:1296533{{!}}hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)]] (duration: 09m 02s)
* 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93539 and previous config saved to /var/cache/conftool/dbconfig/20260602-121451-fceratto.json
* 12:11 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2049.codfw.wmnet with reason: host reimage
* 12:11 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt with reason: Switch maintenance
* 12:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2161: Migration of db2161.codfw.wmnet completed
* 12:09 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Switch maintenance
* 12:09 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296532{{!}}hCaptcha: Deduplicate edit API detection code (T427887)]], [[gerrit:1296533{{!}}hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1200 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93537 and previous config saved to /var/cache/conftool/dbconfig/20260602-120755-fceratto.json
* 12:07 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
* 12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93536 and previous config saved to /var/cache/conftool/dbconfig/20260602-120728-fceratto.json
* 12:07 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 12:07 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296532{{!}}hCaptcha: Deduplicate edit API detection code (T427887)]], [[gerrit:1296533{{!}}hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)]]
* 12:05 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2049.codfw.wmnet with reason: host reimage
* 12:04 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 12:02 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 12:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2161.codfw.wmnet with OS trixie
* 12:00 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 11:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P93535 and previous config saved to /var/cache/conftool/dbconfig/20260602-115721-fceratto.json
* 11:55 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 11:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:55 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 11:53 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 11:53 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 11:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:50 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS trixie
* 11:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS trixie
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2049.codfw.wmnet with OS trixie
* 11:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2049: Upgrading es2049.codfw.wmnet
* 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2049: Upgrading es2049.codfw.wmnet
* 11:47 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:47 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS trixie
* 11:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2056: repool after upgrade
* 11:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P93532 and previous config saved to /var/cache/conftool/dbconfig/20260602-114713-fceratto.json
* 11:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS trixie
* 11:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2161.codfw.wmnet with reason: host reimage
* 11:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2161.codfw.wmnet with reason: host reimage
* 11:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93531 and previous config saved to /var/cache/conftool/dbconfig/20260602-113705-fceratto.json
* 11:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1185 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93529 and previous config saved to /var/cache/conftool/dbconfig/20260602-113019-fceratto.json
* 11:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
* 11:29 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 11:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1161: Repooling
* 11:26 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1161: Repooling
* 11:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS trixie
* 11:22 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 11:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2161: Upgrading db2161.codfw.wmnet
* 11:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2161: Upgrading db2161.codfw.wmnet
* 11:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 11:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P93527 and previous config saved to /var/cache/conftool/dbconfig/20260602-111954-fceratto.json
* 11:15 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2161 [[phab:T427892|T427892]]', diff saved to https://phabricator.wikimedia.org/P93525 and previous config saved to /var/cache/conftool/dbconfig/20260602-111511-cwilliams.json
* 11:12 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2165 to s8 primary [[phab:T427892|T427892]]', diff saved to https://phabricator.wikimedia.org/P93524 and previous config saved to /var/cache/conftool/dbconfig/20260602-111200-cwilliams.json
* 11:10 cezmunsta: Starting s8 codfw failover from db2161 to db2165 - [[phab:T427892|T427892]]
* 11:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P93523 and previous config saved to /var/cache/conftool/dbconfig/20260602-110947-fceratto.json
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS trixie
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS trixie
* 11:04 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2165 with weight 0 [[phab:T427892|T427892]]', diff saved to https://phabricator.wikimedia.org/P93522 and previous config saved to /var/cache/conftool/dbconfig/20260602-110420-cwilliams.json
* 11:03 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 [[phab:T427892|T427892]]
* 11:02 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2056: repool after upgrade
* 11:01 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93520 and previous config saved to /var/cache/conftool/dbconfig/20260602-105939-fceratto.json
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93519 and previous config saved to /var/cache/conftool/dbconfig/20260602-105239-fceratto.json
* 10:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 10:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93518 and previous config saved to /var/cache/conftool/dbconfig/20260602-105202-fceratto.json
* 10:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2056.codfw.wmnet with OS trixie
* 10:42 moritzm: installing busybox security updates
* 10:42 claime: Enabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P93517 and previous config saved to /var/cache/conftool/dbconfig/20260602-104154-fceratto.json
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P93516 and previous config saved to /var/cache/conftool/dbconfig/20260602-103146-fceratto.json
* 10:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2056.codfw.wmnet with reason: host reimage
* 10:27 claime: Disabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 10:25 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2056.codfw.wmnet with reason: host reimage
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93515 and previous config saved to /var/cache/conftool/dbconfig/20260602-102139-fceratto.json
* 10:09 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2056.codfw.wmnet with OS trixie
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2056: Upgrading es2056.codfw.wmnet
* 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2056: Upgrading es2056.codfw.wmnet
* 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 09:56 claime: Enabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 09:46 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on cumin2003.codfw.wmnet with reason: in setup
* 09:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1187: Pooling
* 09:37 claime: Running puppet on cp6010 and cp6011 - [[phab:T422937|T422937]]
* 09:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow2004.codfw.wmnet to plain
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93511 and previous config saved to /var/cache/conftool/dbconfig/20260602-093716-fceratto.json
* 09:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1159.eqiad.wmnet with reason: Maintenance
* 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow2004.codfw.wmnet to plain
* 09:34 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of rpki2003.codfw.wmnet to plain
* 09:34 claime: Disabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 09:34 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of rpki2003.codfw.wmnet to plain
* 09:32 moritzm: temporarily remove ganeti2045 from the codfw cluster [[phab:T427357|T427357]]
* 09:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1055.eqiad.wmnet with OS trixie
* 09:15 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1187: Pooling
* 09:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1187 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93508 and previous config saved to /var/cache/conftool/dbconfig/20260602-091126-fceratto.json
* 09:09 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1187 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93506 and previous config saved to /var/cache/conftool/dbconfig/20260602-090432-fceratto.json
* 09:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
* 08:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2250.codfw.wmnet with reason: rack A3 maintenance
* 08:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS trixie
* 08:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:53 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:52 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:51 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:50 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 08:41 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:37 urbanecm: Reset user email of Barras@votewiki to the one of Barras@SUL
* 08:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 08:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93505 and previous config saved to /var/cache/conftool/dbconfig/20260602-083033-fceratto.json
* 08:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:29 slyngs: IDP, new configuration in preparation for webauthn
* 08:20 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P93504 and previous config saved to /var/cache/conftool/dbconfig/20260602-082026-fceratto.json
* 08:19 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:18 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:18 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:17 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296488{{!}}Revert "translate: adding separate read/write endpoints" (T425377)]] (duration: 03m 33s)
* 08:16 atsuko@deploy1003: atsuko: Rolling back deployment
* 08:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2053: repool after upgrade
* 08:15 atsuko@deploy1003: atsuko: Backport for [[gerrit:1296488{{!}}Revert "translate: adding separate read/write endpoints" (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:13 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1296488{{!}}Revert "translate: adding separate read/write endpoints" (T425377)]]
* 08:11 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 marostegui: Install mariadb 10.11.17 on es2053 [[phab:T427345|T427345]]
* 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P93502 and previous config saved to /var/cache/conftool/dbconfig/20260602-081018-fceratto.json
* 08:09 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Depool for rack maintenance
* 08:03 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296262{{!}}translate: fixing missed variable in credentials formatting closure (T425377)]] (duration: 14m 47s)
* 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93499 and previous config saved to /var/cache/conftool/dbconfig/20260602-080011-fceratto.json
* 07:59 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 07:59 atsuko@deploy1003: atsuko: Rolling back deployment
* 07:58 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 07:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1181 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93498 and previous config saved to /var/cache/conftool/dbconfig/20260602-075759-fceratto.json
* 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 07:57 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
* 07:50 atsuko@deploy1003: atsuko: Backport for [[gerrit:1296262{{!}}translate: fixing missed variable in credentials formatting closure (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:49 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1296262{{!}}translate: fixing missed variable in credentials formatting closure (T425377)]]
* 07:48 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1181: Pooling
* 07:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1181: Pooling
* 07:44 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1181: Reboot
* 07:43 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1181: Reboot
* 07:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1181.eqiad.wmnet with reason: Reboot
* 07:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1181: Migration of db1181.eqiad.wmnet completed
* 07:40 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294949{{!}}translate: adding separate read/write endpoints (T425377)]] (duration: 21m 01s)
* 07:39 atsuko@deploy1003: atsuko: Rolling back deployment
* 07:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93490 and previous config saved to /var/cache/conftool/dbconfig/20260602-073904-fceratto.json
* 07:32 XioNoX: pfw1-eqiad# delete protocols bgp group Production family inet6 - [[phab:T423384|T423384]]
* 07:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2053: repool after upgrade
* 07:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2158.codfw.wmnet with reason: rack A3 maintenance
* 07:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93487 and previous config saved to /var/cache/conftool/dbconfig/20260602-072856-fceratto.json
* 07:28 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2158: rack A3 maintenance
* 07:28 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2158: rack A3 maintenance
* 07:27 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on pc2021.codfw.wmnet with reason: rack A3 maintenance
* 07:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: rack A3 maintenance
* 07:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 07:25 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 07:25 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: rack A3 maintenance
* 07:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Depool for rack maintenance
* 07:23 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2241.codfw.wmnet
* 07:23 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2241.codfw.wmnet
* 07:21 atsuko@deploy1003: atsuko: Backport for [[gerrit:1294949{{!}}translate: adding separate read/write endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2053.codfw.wmnet with OS trixie
* 07:19 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1294949{{!}}translate: adding separate read/write endpoints (T425377)]]
* 07:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2241.codfw.wmnet with reason: Depool for rack maintenance
* 07:14 marostegui: Install mariadb 10.11.17 on db2186 [[phab:T427345|T427345]]
* 07:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Depool for rack maintenance
* 07:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2186.codfw.wmnet with reason: upgrade
* 07:12 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Depool for rack maintenance
* 07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2053.codfw.wmnet with reason: host reimage
* 06:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2053.codfw.wmnet with reason: host reimage
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93478 and previous config saved to /var/cache/conftool/dbconfig/20260602-065533-fceratto.json
* 06:55 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1181: Migration of db1181.eqiad.wmnet completed
* 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 06:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1181.eqiad.wmnet with OS trixie
* 06:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2053.codfw.wmnet with OS trixie
* 06:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2053: Upgrading es2053.codfw.wmnet
* 06:41 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2053: Upgrading es2053.codfw.wmnet
* 06:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 06:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 06:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 06:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1052: repool after upgrade
* 06:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
* 06:24 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
* 06:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 06:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 06:16 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 06:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 06:08 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1181.eqiad.wmnet with OS trixie
* 06:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1181: Upgrading db1181.eqiad.wmnet
* 06:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1181: Upgrading db1181.eqiad.wmnet
* 06:04 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:02 marostegui@dns1004: END - running authdns-update
* 06:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1181 [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93473 and previous config saved to /var/cache/conftool/dbconfig/20260602-060157-marostegui.json
* 06:01 marostegui@dns1004: START - running authdns-update
* 06:00 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1236 to s7 primary and set section read-write [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93472 and previous config saved to /var/cache/conftool/dbconfig/20260602-060041-marostegui.json
* 06:00 marostegui@cumin1003: dbctl commit (dc=all): 'Set s7 eqiad as read-only for maintenance - [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93471 and previous config saved to /var/cache/conftool/dbconfig/20260602-060018-marostegui.json
* 06:00 marostegui: Starting s7 eqiad failover from db1181 to db1236 - [[phab:T426088|T426088]]
* 05:51 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1236 with weight 0 [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93470 and previous config saved to /var/cache/conftool/dbconfig/20260602-055153-marostegui.json
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s7 [[phab:T426088|T426088]]
* 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1052: repool after upgrade
* 05:50 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 05:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1052.eqiad.wmnet with OS trixie
* 05:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:29 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1052.eqiad.wmnet with reason: host reimage
* 05:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1052.eqiad.wmnet with reason: host reimage
* 05:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:07 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1052.eqiad.wmnet with OS trixie
* 05:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1052: Upgrading es1052.eqiad.wmnet
* 05:06 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1052: Upgrading es1052.eqiad.wmnet
* 05:05 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 04:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 04:49 ryankemper: [[phab:T425007|T425007]] (k8s) created 4 wdqs namespaces on `dse-k8s-codfw`'s `admin_ng` ns: `wdqs-[internal,external]` & `wdqs-[internal,external]-next`; certs issued
* 04:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 04:40 ryankemper@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 04:36 ryankemper@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 04:05 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.2 (duration: 05m 33s)
== 2026-06-01 ==
* 23:27 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295963{{!}}Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542)]], [[gerrit:1295962{{!}}Carousel: Defer to MobileFrontend lightbox on mobile (T427679)]] (duration: 07m 17s)
* 23:23 jdlrobson@deploy1003: mfossati, jdlrobson: Continuing with deployment
* 23:22 jdlrobson@deploy1003: mfossati, jdlrobson: Backport for [[gerrit:1295963{{!}}Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542)]], [[gerrit:1295962{{!}}Carousel: Defer to MobileFrontend lightbox on mobile (T427679)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:20 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1295963{{!}}Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542)]], [[gerrit:1295962{{!}}Carousel: Defer to MobileFrontend lightbox on mobile (T427679)]]
* 23:15 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296022{{!}}Donor Delight Badge: Add dependency on mw.user (T427850)]], [[gerrit:1296028{{!}}styles: Limit selector to badge client pref (T427407)]] (duration: 09m 33s)
* 23:11 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:07 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1296022{{!}}Donor Delight Badge: Add dependency on mw.user (T427850)]], [[gerrit:1296028{{!}}styles: Limit selector to badge client pref (T427407)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:06 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1296022{{!}}Donor Delight Badge: Add dependency on mw.user (T427850)]], [[gerrit:1296028{{!}}styles: Limit selector to badge client pref (T427407)]]
* 23:04 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6015.*
* 22:36 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296024{{!}}Add maintenance script to scrape SVG render files]] (duration: 06m 22s)
* 22:32 reedy@deploy1003: reedy: Continuing with deployment
* 22:31 reedy@deploy1003: reedy: Backport for [[gerrit:1296024{{!}}Add maintenance script to scrape SVG render files]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:30 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1296024{{!}}Add maintenance script to scrape SVG render files]]
* 22:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 22:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 22:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 21:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 21:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 21:51 sbassett: Deployed updated mitigation for [[phab:T326691|T326691]]
* 21:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 21:35 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 21:35 maryum: Deployed security fix for [[phab:T427611|T427611]]
* 21:35 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 21:33 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 21:32 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 21:27 maryum: Deployed security fix for [[phab:T427235|T427235]]
* 21:13 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296002{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565)]], [[gerrit:1296003{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T427565)]], [[gerrit:1296009{{!}}Redirect Special:AccountRecovery to the shared domain (T427692)]] (duration: 09m 20s)
* 21:09 catrope@deploy1003: catrope, arlolra: Continuing with deployment
* 21:09 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 21:09 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 21:08 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 21:07 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 21:07 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 21:06 catrope@deploy1003: catrope, arlolra: Backport for [[gerrit:1296002{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565)]], [[gerrit:1296003{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T427565)]], [[gerrit:1296009{{!}}Redirect Special:AccountRecovery to the shared domain (T427692)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:04 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1296002{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565)]], [[gerrit:1296003{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T427565)]], [[gerrit:1296009{{!}}Redirect Special:AccountRecovery to the shared domain (T427692)]]
* 20:53 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 20:37 ryankemper@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wdqs1015.eqiad.wmnet with reason: [[phab:T427852|T427852]] hw failure
* 20:26 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285412{{!}}Remove `wgTestKitchenExperimentStreamNames` (T422358)]], [[gerrit:1295531{{!}}Enable AbuseFilter block action on nlwiki (T427384)]] (duration: 07m 48s)
* 20:22 catrope@deploy1003: sfaci, xxblackburnxx, catrope: Continuing with deployment
* 20:20 catrope@deploy1003: sfaci, xxblackburnxx, catrope: Backport for [[gerrit:1285412{{!}}Remove `wgTestKitchenExperimentStreamNames` (T422358)]], [[gerrit:1295531{{!}}Enable AbuseFilter block action on nlwiki (T427384)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:18 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1285412{{!}}Remove `wgTestKitchenExperimentStreamNames` (T422358)]], [[gerrit:1295531{{!}}Enable AbuseFilter block action on nlwiki (T427384)]]
* 20:12 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295504{{!}}passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)]] (duration: 07m 37s)
* 20:08 catrope@deploy1003: catrope: Continuing with deployment
* 20:07 catrope@deploy1003: catrope: Backport for [[gerrit:1295504{{!}}passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:05 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1295504{{!}}passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)]]
* 19:48 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 19:47 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 19:47 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 19:46 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 19:46 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 19:45 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 19:01 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: sync
* 19:00 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: sync
* 18:24 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295950{{!}}mediawiki.user_change.dev0 - key by user.wiki_id (T426198)]] (duration: 06m 42s)
* 18:20 otto@deploy1003: otto: Continuing with deployment
* 18:19 otto@deploy1003: otto: Backport for [[gerrit:1295950{{!}}mediawiki.user_change.dev0 - key by user.wiki_id (T426198)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:17 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1295950{{!}}mediawiki.user_change.dev0 - key by user.wiki_id (T426198)]]
* 18:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 18:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 18:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd2001.codfw.wmnet to plain
* 18:02 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 18:02 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd2001.codfw.wmnet to plain
* 18:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2003.codfw.wmnet to plain
* 18:01 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 18:01 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2003.codfw.wmnet to plain
* 17:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 17:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 17:53 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS trixie
* 17:42 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295976{{!}}nlwiki: change to Wikipedia 25 logo (T424519)]] (duration: 07m 29s)
* 17:37 samtar@deploy1003: chlod, samtar: Continuing with deployment
* 17:36 samtar@deploy1003: chlod, samtar: Backport for [[gerrit:1295976{{!}}nlwiki: change to Wikipedia 25 logo (T424519)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:34 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1295976{{!}}nlwiki: change to Wikipedia 25 logo (T424519)]]
* 17:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1236: Update
* 17:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd2001.codfw.wmnet to drbd
* 17:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
* 17:04 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 17:04 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1180: Pooling
* 17:03 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 17:03 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
* 17:03 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 16:59 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd2001.codfw.wmnet to drbd
* 16:58 Amir1: drop flaggedrevs tables on wikinews wikis ([[phab:T423577|T423577]])
* 16:57 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 16:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93462 and previous config saved to /var/cache/conftool/dbconfig/20260601-165717-fceratto.json
* 16:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93460 and previous config saved to /var/cache/conftool/dbconfig/20260601-164709-fceratto.json
* 16:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
* 16:37 ryankemper@cumin2002: conftool action : set/pooled=no; selector: dc=eqiad,cluster=wdqs-main,service=wdqs-main,name=wdqs1015.eqiad.wmnet
* 16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93458 and previous config saved to /var/cache/conftool/dbconfig/20260601-163701-fceratto.json
* 16:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:35 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1236.eqiad.wmnet
* 16:35 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1236.eqiad.wmnet
* 16:35 ryankemper@cumin2002: conftool action : set/pooled=no; selector: dc=eqiad,cluster=wdqs,service=wdqs-main,name=wdqs1015.eqiad.wmnet
* 16:34 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
* 16:34 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1236: Update
* 16:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1236.eqiad.wmnet with reason: Kernel update [[phab:T426633|T426633]]
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:30 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1236.eqiad.wmnet
* 16:30 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1236.eqiad.wmnet
* 16:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
* 16:29 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1236: Update
* 16:29 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
* 16:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2003.codfw.wmnet to drbd
* 16:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93455 and previous config saved to /var/cache/conftool/dbconfig/20260601-162653-fceratto.json
* 16:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 16:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Migration of db1209.eqiad.wmnet completed
* 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1236.eqiad.wmnet with reason: Kernel update [[phab:T426633|T426633]]
* 16:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1236: Update
* 16:09 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1236: Update
* 16:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:06 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 16:05 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2003.codfw.wmnet to drbd
* 16:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 16:03 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 16:02 moritzm: temporarily remove ganeti2027 from the codfw cluster [[phab:T427357|T427357]]
* 15:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:56 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.depool (exit_code=97) depool db1224: Pooling
* 15:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2005.codfw.wmnet with OS bullseye
* 15:53 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1224: Pooling
* 15:51 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 15:49 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
* 15:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
* 15:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
* 15:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2005.codfw.wmnet with reason: host reimage
* 15:40 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:40 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
* 15:40 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
* 15:40 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
* 15:40 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
* 15:40 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
* 15:39 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:39 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 15:39 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Migration of db1209.eqiad.wmnet completed
* 15:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 15:38 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:38 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
* 15:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2005.codfw.wmnet with reason: host reimage
* 15:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 15:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1209.eqiad.wmnet with OS trixie
* 15:28 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295802{{!}}hCaptcha: Raise SiteVerify error threshold to 100]] (duration: 06m 15s)
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93446 and previous config saved to /var/cache/conftool/dbconfig/20260601-152638-fceratto.json
* 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 15:26 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:25 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
* 15:25 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
* 15:25 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
* 15:25 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:24 kharlan@deploy1003: kharlan: Continuing with deployment
* 15:24 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295802{{!}}hCaptcha: Raise SiteVerify error threshold to 100]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:22 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host testvm2005.codfw.wmnet with OS bullseye
* 15:22 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295802{{!}}hCaptcha: Raise SiteVerify error threshold to 100]]
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:20 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295946{{!}}hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)]] (duration: 08m 24s)
* 15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:16 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
* 15:14 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1295946{{!}}hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1295946{{!}}hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)]]
* 15:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
* 15:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93445 and previous config saved to /var/cache/conftool/dbconfig/20260601-151024-fceratto.json
* 15:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:sessionstore
* 15:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93443 and previous config saved to /var/cache/conftool/dbconfig/20260601-150017-fceratto.json
* 14:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1209.eqiad.wmnet with OS trixie
* 14:52 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 14:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1209: Upgrading db1209.eqiad.wmnet
* 14:52 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 14:52 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1209: Upgrading db1209.eqiad.wmnet
* 14:52 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 14:51 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:51 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 14:50 atsuko@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 14:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93441 and previous config saved to /var/cache/conftool/dbconfig/20260601-145010-fceratto.json
* 14:49 atsuko@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 14:49 atsuko@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 14:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:42 atsuko@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 14:41 atsuko@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93440 and previous config saved to /var/cache/conftool/dbconfig/20260601-144002-fceratto.json
* 14:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:30 ladsgroup@deploy1003: Synchronized portals: Deploy portals ([[phab:T421797|T421797]]) (duration: 02m 43s)
* 14:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:27 ladsgroup@deploy1003: Synchronized portals/wikipedia.org/assets: Deploy portals ([[phab:T421797|T421797]]) (duration: 06m 10s)
* 14:25 sukhe@dns1004: END - running authdns-update
* 14:23 sukhe@dns1004: START - running authdns-update
* 14:22 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:11 Lucas_WMDE: UTC afternoon backport+config window done
* 14:10 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295918{{!}}Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)]] (duration: 11m 06s)
* 14:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:05 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Continuing with deployment
* 14:03 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Backport for [[gerrit:1295918{{!}}Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:01 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:sessionstore
* 13:58 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1295918{{!}}Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)]]
* 13:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1265.eqiad.wmnet with OS trixie
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93439 and previous config saved to /var/cache/conftool/dbconfig/20260601-133947-fceratto.json
* 13:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 13:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 13:35 atsukoito: restarted pybal.service on lvs2013
* 13:31 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 13:31 atsukoito: restarted pybal.service on lvs2014
* 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs-test2001.codfw.wmnet
* 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet
* 13:22 atsukoito: restarted pybal.service on lvs1019
* 13:22 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:21 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:20 atsukoito: restarted pybal.service on lvs1020
* 13:20 Msz2001: UTC afternoon backport+config window done
* 13:20 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295875{{!}}Add SetGlobalPreference maintenance script (T427476)]] (duration: 06m 22s)
* 13:19 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs-test2001.codfw.wmnet
* 13:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1265.eqiad.wmnet with OS trixie
* 13:18 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs-test1001.eqiad.wmnet
* 13:16 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 13:15 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1295875{{!}}Add SetGlobalPreference maintenance script (T427476)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 atsukoito: sudo cumin 'A:lvs-low-traffic-eqiad' 'systemctl restart pybal.service'
* 13:14 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1295875{{!}}Add SetGlobalPreference maintenance script (T427476)]]
* 13:12 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295536{{!}}swwiki: Enable the Visual Editor on the project namespace (T427117)]] (duration: 10m 06s)
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93438 and previous config saved to /var/cache/conftool/dbconfig/20260601-130949-fceratto.json
* 13:08 mszwarc@deploy1003: codenamenoreste, mszwarc: Continuing with deployment
* 13:07 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:06 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:05 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:04 mszwarc@deploy1003: codenamenoreste, mszwarc: Backport for [[gerrit:1295536{{!}}swwiki: Enable the Visual Editor on the project namespace (T427117)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:03 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:02 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1295536{{!}}swwiki: Enable the Visual Editor on the project namespace (T427117)]]
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93437 and previous config saved to /var/cache/conftool/dbconfig/20260601-125941-fceratto.json
* 12:56 dpogorzelski@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=inference,name=eqiad
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'readability' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 12:52 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:50 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93436 and previous config saved to /var/cache/conftool/dbconfig/20260601-124934-fceratto.json
* 12:48 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:41 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93435 and previous config saved to /var/cache/conftool/dbconfig/20260601-123926-fceratto.json
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:29 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:28 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:27 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to plain
* 12:26 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to plain
* 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
* 12:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to drbd
* 12:20 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:17 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:15 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:15 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:11 dpogorzelski@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=inference,name=eqiad
* 12:07 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to drbd
* 12:05 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
* 12:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:04 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2027.codfw.wmnet
* 12:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 11:59 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
* 11:59 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
* 11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93434 and previous config saved to /var/cache/conftool/dbconfig/20260601-113911-fceratto.json
* 11:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 11:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93433 and previous config saved to /var/cache/conftool/dbconfig/20260601-113843-fceratto.json
* 11:37 moritzm: installing Exim security updates
* 11:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93432 and previous config saved to /var/cache/conftool/dbconfig/20260601-112835-fceratto.json
* 11:25 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 11:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:22 moritzm: installing imagemagick security updates
* 11:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:22 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 11:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93430 and previous config saved to /var/cache/conftool/dbconfig/20260601-111827-fceratto.json
* 11:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:14 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 11:12 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 11:10 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93429 and previous config saved to /var/cache/conftool/dbconfig/20260601-110820-fceratto.json
* 11:04 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 11:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1055: repool after upgrade
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93427 and previous config saved to /var/cache/conftool/dbconfig/20260601-110121-fceratto.json
* 11:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 10:54 marostegui@dns1004: END - running authdns-update
* 10:52 marostegui@dns1004: START - running authdns-update
* 10:48 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1050 to es1 eqiad primary [[phab:T427032|T427032]]', diff saved to https://phabricator.wikimedia.org/P93425 and previous config saved to /var/cache/conftool/dbconfig/20260601-104837-marostegui.json
* 10:47 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2055 to es1 codfw primary [[phab:T427032|T427032]]', diff saved to https://phabricator.wikimedia.org/P93424 and previous config saved to /var/cache/conftool/dbconfig/20260601-104739-marostegui.json
* 10:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1177: Migration of db1177.eqiad.wmnet completed
* 10:40 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2003.codfw.wmnet
* 10:34 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2003.codfw.wmnet
* 10:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93421 and previous config saved to /var/cache/conftool/dbconfig/20260601-103316-fceratto.json
* 10:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93418 and previous config saved to /var/cache/conftool/dbconfig/20260601-102308-fceratto.json
* 10:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1055: repool after upgrade
* 10:15 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1055.eqiad.wmnet with OS trixie
* 10:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93415 and previous config saved to /var/cache/conftool/dbconfig/20260601-101300-fceratto.json
* 10:09 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:07 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93414 and previous config saved to /var/cache/conftool/dbconfig/20260601-100252-fceratto.json
* 10:00 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1177: Migration of db1177.eqiad.wmnet completed
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1055.eqiad.wmnet with reason: host reimage
* 09:56 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 09:54 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 09:53 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1055.eqiad.wmnet with reason: host reimage
* 09:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1177.eqiad.wmnet with OS trixie
* 09:51 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:50 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:39 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1055.eqiad.wmnet with OS trixie
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1055: Upgrading es1055.eqiad.wmnet
* 09:38 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1055: Upgrading es1055.eqiad.wmnet
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
* 09:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
* 09:17 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1177.eqiad.wmnet with OS trixie
* 09:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:14 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:13 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:12 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1177: Upgrading db1177.eqiad.wmnet
* 09:11 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1177: Upgrading db1177.eqiad.wmnet
* 09:11 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93410 and previous config saved to /var/cache/conftool/dbconfig/20260601-090237-fceratto.json
* 09:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93409 and previous config saved to /var/cache/conftool/dbconfig/20260601-090209-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P93408 and previous config saved to /var/cache/conftool/dbconfig/20260601-085202-fceratto.json
* 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P93407 and previous config saved to /var/cache/conftool/dbconfig/20260601-084154-fceratto.json
* 08:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93406 and previous config saved to /var/cache/conftool/dbconfig/20260601-083146-fceratto.json
* 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93405 and previous config saved to /var/cache/conftool/dbconfig/20260601-082442-fceratto.json
* 08:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 07:58 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295454{{!}}Disable the creation of synthetic main refs in production (T427484)]] (duration: 11m 26s)
* 07:56 XioNoX: add no_p2p term to pfw1-codfw BGP_fundraising_export - [[phab:T423384|T423384]]
* 07:52 wmde-fisch@deploy1003: lilients, wmde-fisch: Continuing with deployment
* 07:51 wmde-fisch@deploy1003: lilients, wmde-fisch: Backport for [[gerrit:1295454{{!}}Disable the creation of synthetic main refs in production (T427484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:47 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1295454{{!}}Disable the creation of synthetic main refs in production (T427484)]]
* 07:45 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294826{{!}}Update VE core submodule to master (9cf5524e7) (T424232)]] (duration: 31m 34s)
* 07:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 07:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 07:32 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
* 07:31 wmde-fisch@deploy1003: wmde-fisch: Backport for [[gerrit:1294826{{!}}Update VE core submodule to master (9cf5524e7) (T424232)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 07:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1294826{{!}}Update VE core submodule to master (9cf5524e7) (T424232)]]
* 06:48 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 06:47 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
== 2026-05-31 ==
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 30s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-30 ==
* 16:21 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:38 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 27s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-29 ==
* 23:39 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 23:37 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 21:42 catrope@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 21:41 catrope@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:40 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] (duration: 06m 54s)
* 17:35 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 17:34 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:33 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]]
* 16:30 jgreen@dns1004: END - running authdns-update
* 16:28 jgreen@dns1004: START - running authdns-update
* 16:13 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:12 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:28 dancy@deploy1003: Installation of scap version "4.267.0" completed for 2 hosts
* 15:26 dancy@deploy1003: Installing scap version "4.267.0" for 2 host(s)
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:15 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] (duration: 07m 58s)
* 14:11 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:09 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]]
* 13:53 moritzm: imported OpenJDK 21 21.0.11+10-1~deb12u1 to component/jdk21 (backport of latest Java 21 security release for Bookworm)
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1006.wikimedia.org
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1006.wikimedia.org with OS trixie
* 11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1006.wikimedia.org with OS trixie
* 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:15 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:13 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:00 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:00 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1006.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1005.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1005.wikimedia.org with OS trixie
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:40 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Pooling
* 10:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1005.wikimedia.org with OS trixie
* 10:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:55 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 09:50 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 09:49 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:44 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2014.codfw.wmnet with OS bookworm
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:20 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:12 jynus@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:10 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:03 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad2002.codfw.wmnet
* 08:59 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad2002.codfw.wmnet
* 08:59 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad2002.codfw.wmnet - [[phab:T427588|T427588]]
* 08:54 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:51 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad1004.eqiad.wmnet
* 08:50 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
* 08:50 jynus@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host backup2014.codfw.wmnet with OS bookworm
* 08:49 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
* 08:47 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad1004.eqiad.wmnet
* 08:46 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad1004.eqiad.wmnet - [[phab:T427588|T427588]]
* 08:42 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:39 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:39 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:38 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 08:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 08:33 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:31 jynus@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup2014.codfw.wmnet with OS bookworm
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:18 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 08:17 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 08:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1005.wikimedia.org
* 08:05 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 07:54 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 07:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2212.codfw.wmnet
* 07:54 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2212.codfw.wmnet
* 07:22 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2006.wikimedia.org
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2006.wikimedia.org with OS trixie
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2006.wikimedia.org with OS trixie
* 06:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2006.wikimedia.org
* 03:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts db1224.eqiad.wmnet
* 02:56 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 01:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5032.eqsin.wmnet with OS trixie
* 01:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 01:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 00:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 00:29 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host cp5032.eqsin.wmnet
* 00:23 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:22 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
== 2026-05-28 ==
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:07 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:34 andrewbogott: reprepro includedeb trixie-wikimedia /home/andrew/magnum-cluster-api_0.36.6-1~wmf13u2_amd64.deb
* 22:31 logmsgbot: dreamyjazz Deployed security patch for [[phab:T426388|T426388]]
* 21:33 maryum: Deployed security fix for [[phab:T426867|T426867]]
* 21:21 alexsanford: Deployed security fix for [[phab:T426889|T426889]]
* 21:07 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host cp5032.eqsin.wmnet
* 21:04 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 21:04 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 20:48 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] (duration: 07m 34s)
* 20:44 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:43 arlolra@deploy1003: arlolra: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:41 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]]
* 20:34 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] (duration: 07m 20s)
* 20:30 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:29 arlolra@deploy1003: arlolra: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:27 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]]
* 20:22 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] (duration: 09m 07s)
* 20:18 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Continuing with deployment
* 20:14 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] synced to the testservers (see https://wikitech.
* 20:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5032.eqsin.wmnet with OS trixie
* 20:13 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]]
* 19:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1018.eqiad.wmnet
* 19:27 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1018.eqiad.wmnet
* 19:09 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: Kernel reboot
* 19:09 brett: Stopping pybal/puppet/downtiming lvs1018.eqiad.wmnet for reboot
* 19:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1019.eqiad.wmnet
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1019.eqiad.wmnet
* 18:52 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:51 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:47 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:40 mutante: planet1003/planet2003 - apt-get upgrade - all pending package upgrades
* 18:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: Kernel reboot
* 18:34 brett: Stopping pybal/puppet/downtiming lvs1019.eqiad.wmnet for reboot and BIOS update/memory self-healing - [[phab:T426109|T426109]]
* 18:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2011.codfw.wmnet
* 18:25 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2011.codfw.wmnet
* 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Kernel reboot
* 18:19 brett: Stopping pybal/puppet/downtiming lvs2011.codfw.wmnet for reboot
* 18:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 18:06 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 18:00 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: Kernel reboot
* 17:57 brett: Stopping pybal/puppet/downtiming lvs2013.codfw.wmnet for reboot
* 17:19 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 16:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93393 and previous config saved to /var/cache/conftool/dbconfig/20260528-164514-fceratto.json
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93392 and previous config saved to /var/cache/conftool/dbconfig/20260528-163507-fceratto.json
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93391 and previous config saved to /var/cache/conftool/dbconfig/20260528-162459-fceratto.json
* 16:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db1224.eqiad.wmnet with reason: unreachable [[phab:T427535|T427535]]
* 16:17 swfrench-wmf: reprepro include xdebug_3.4.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include wikidiff2_1.14.1-2+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include php-yaml_2.2.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-xhprof_2.3.10-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-wmerrors_2.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-uuid_1.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-redis_6.2.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-pcov_1.0.12-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-memcached_3.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:15 swfrench-wmf: reprepro include php-luasandbox_4.1.2-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93390 and previous config saved to /var/cache/conftool/dbconfig/20260528-161452-fceratto.json
* 16:14 swfrench-wmf: reprepro include php-imagick_3.7.0-13+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:14 swfrench-wmf: reprepro include php-excimer_1.2.5-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1251 ([[phab:T426633|T426633]])', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260528-160646-fceratto.json
* 16:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1251.eqiad.wmnet with reason: Maintenance
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93388 and previous config saved to /var/cache/conftool/dbconfig/20260528-160613-fceratto.json
* 15:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93387 and previous config saved to /var/cache/conftool/dbconfig/20260528-155605-fceratto.json
* 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93386 and previous config saved to /var/cache/conftool/dbconfig/20260528-154557-fceratto.json
* 15:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93385 and previous config saved to /var/cache/conftool/dbconfig/20260528-153550-fceratto.json
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93384 and previous config saved to /var/cache/conftool/dbconfig/20260528-152736-fceratto.json
* 15:27 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: Maintenance
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93383 and previous config saved to /var/cache/conftool/dbconfig/20260528-152708-fceratto.json
* 15:20 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5032.eqsin.wmnet with reason: Testing reimaging on new subnet
* 15:18 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 15:17 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93382 and previous config saved to /var/cache/conftool/dbconfig/20260528-151701-fceratto.json
* 15:17 jhathaway: dmarc ingress test on mx-in1001
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93381 and previous config saved to /var/cache/conftool/dbconfig/20260528-150653-fceratto.json
* 14:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93380 and previous config saved to /var/cache/conftool/dbconfig/20260528-145646-fceratto.json
* 14:56 moritzm: installing nginx security updates
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93379 and previous config saved to /var/cache/conftool/dbconfig/20260528-144936-fceratto.json
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: Maintenance
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93378 and previous config saved to /var/cache/conftool/dbconfig/20260528-144909-fceratto.json
* 14:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2005.wikimedia.org
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2005.wikimedia.org with OS trixie
* 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:39 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93377 and previous config saved to /var/cache/conftool/dbconfig/20260528-143901-fceratto.json
* 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93376 and previous config saved to /var/cache/conftool/dbconfig/20260528-142854-fceratto.json
* 14:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] (duration: 11m 29s)
* 14:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93375 and previous config saved to /var/cache/conftool/dbconfig/20260528-141846-fceratto.json
* 14:15 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93374 and previous config saved to /var/cache/conftool/dbconfig/20260528-141029-fceratto.json
* 14:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1232.eqiad.wmnet with reason: Maintenance
* 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2005.wikimedia.org with OS trixie
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93373 and previous config saved to /var/cache/conftool/dbconfig/20260528-141001-fceratto.json
* 14:09 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]]
* 14:00 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93371 and previous config saved to /var/cache/conftool/dbconfig/20260528-135951-fceratto.json
* 13:58 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6015.drmrs.wmnet,service=(cdn{{!}}ats-be)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93370 and previous config saved to /var/cache/conftool/dbconfig/20260528-134944-fceratto.json
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93369 and previous config saved to /var/cache/conftool/dbconfig/20260528-133936-fceratto.json
* 13:39 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:38 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:36 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] (duration: 06m 40s)
* 13:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93368 and previous config saved to /var/cache/conftool/dbconfig/20260528-133230-fceratto.json
* 13:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93367 and previous config saved to /var/cache/conftool/dbconfig/20260528-133202-fceratto.json
* 13:31 mlitn@deploy1003: mlitn: Continuing with deployment
* 13:31 mlitn@deploy1003: mlitn: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:29 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]]
* 13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93366 and previous config saved to /var/cache/conftool/dbconfig/20260528-132155-fceratto.json
* 13:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:17 elukey: clean up a lot of stale Kafka ACLs on Kafka Jumbo - Details in [[phab:T425528|T425528]]
* 13:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2005.wikimedia.org
* 13:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93365 and previous config saved to /var/cache/conftool/dbconfig/20260528-131147-fceratto.json
* 13:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93364 and previous config saved to /var/cache/conftool/dbconfig/20260528-130139-fceratto.json
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93363 and previous config saved to /var/cache/conftool/dbconfig/20260528-125439-fceratto.json
* 12:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93362 and previous config saved to /var/cache/conftool/dbconfig/20260528-125412-fceratto.json
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93361 and previous config saved to /var/cache/conftool/dbconfig/20260528-124404-fceratto.json
* 12:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:43 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:39 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:38 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93360 and previous config saved to /var/cache/conftool/dbconfig/20260528-123357-fceratto.json
* 12:25 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93359 and previous config saved to /var/cache/conftool/dbconfig/20260528-122349-fceratto.json
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93358 and previous config saved to /var/cache/conftool/dbconfig/20260528-121551-fceratto.json
* 12:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance
* 12:15 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93357 and previous config saved to /var/cache/conftool/dbconfig/20260528-121523-fceratto.json
* 12:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93356 and previous config saved to /var/cache/conftool/dbconfig/20260528-120515-fceratto.json
* 12:02 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthboo-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 12:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93355 and previous config saved to /var/cache/conftool/dbconfig/20260528-115508-fceratto.json
* 11:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93354 and previous config saved to /var/cache/conftool/dbconfig/20260528-114500-fceratto.json
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93353 and previous config saved to /var/cache/conftool/dbconfig/20260528-113635-fceratto.json
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93352 and previous config saved to /var/cache/conftool/dbconfig/20260528-113559-fceratto.json
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93351 and previous config saved to /var/cache/conftool/dbconfig/20260528-112551-fceratto.json
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93350 and previous config saved to /var/cache/conftool/dbconfig/20260528-111543-fceratto.json
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93349 and previous config saved to /var/cache/conftool/dbconfig/20260528-110536-fceratto.json
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93348 and previous config saved to /var/cache/conftool/dbconfig/20260528-105820-fceratto.json
* 10:58 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1195.eqiad.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93347 and previous config saved to /var/cache/conftool/dbconfig/20260528-105753-fceratto.json
* 10:56 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 10:50 moritzm: update trixie netboot image for 13.5 point release [[phab:T427072|T427072]]
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93346 and previous config saved to /var/cache/conftool/dbconfig/20260528-104745-fceratto.json
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93345 and previous config saved to /var/cache/conftool/dbconfig/20260528-103738-fceratto.json
* 10:29 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P13724 # [[phab:T406971|T406971]]
* 10:28 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P14223 # [[phab:T422264|T422264]]
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93344 and previous config saved to /var/cache/conftool/dbconfig/20260528-102730-fceratto.json
* 10:26 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P1748 # [[phab:T422392|T422392]]
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93343 and previous config saved to /var/cache/conftool/dbconfig/20260528-101900-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93342 and previous config saved to /var/cache/conftool/dbconfig/20260528-101829-fceratto.json
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93341 and previous config saved to /var/cache/conftool/dbconfig/20260528-100822-fceratto.json
* 09:59 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] (duration: 06m 41s)
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93340 and previous config saved to /var/cache/conftool/dbconfig/20260528-095814-fceratto.json
* 09:55 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 09:54 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:52 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]]
* 09:48 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] (duration: 07m 37s)
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93339 and previous config saved to /var/cache/conftool/dbconfig/20260528-094807-fceratto.json
* 09:44 dreamyjazz@deploy1003: dreamyjazz, stran: Continuing with deployment
* 09:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:42 dreamyjazz@deploy1003: dreamyjazz, stran: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]]
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93338 and previous config saved to /var/cache/conftool/dbconfig/20260528-093920-fceratto.json
* 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93337 and previous config saved to /var/cache/conftool/dbconfig/20260528-093849-fceratto.json
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93336 and previous config saved to /var/cache/conftool/dbconfig/20260528-092842-fceratto.json
* 09:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
* 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93335 and previous config saved to /var/cache/conftool/dbconfig/20260528-092239-fceratto.json
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki-root1001.eqiad.wmnet
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93334 and previous config saved to /var/cache/conftool/dbconfig/20260528-091834-fceratto.json
* 09:18 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1165: Reboot completed
* 09:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:17 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 09:14 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:13 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki-root1001.eqiad.wmnet
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93332 and previous config saved to /var/cache/conftool/dbconfig/20260528-091231-fceratto.json
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93331 and previous config saved to /var/cache/conftool/dbconfig/20260528-090826-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93329 and previous config saved to /var/cache/conftool/dbconfig/20260528-090224-fceratto.json
* 09:02 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod (duration: 02m 31s)
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93328 and previous config saved to /var/cache/conftool/dbconfig/20260528-090114-fceratto.json
* 09:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2216.codfw.wmnet with reason: Maintenance
* 09:00 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a] (duration: 02m 08s)
* 08:59 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod
* 08:58 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a]
* 08:57 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host (duration: 00m 53s)
* 08:56 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host
* 08:56 joal@deploy1003: Finished deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a] (duration: 06m 54s)
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93327 and previous config saved to /var/cache/conftool/dbconfig/20260528-085216-fceratto.json
* 08:50 XioNoX: cr1-codfw# delete protocols bgp group fundraising family inet6 - [[phab:T423384|T423384]]
* 08:49 joal@deploy1003: Started deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a]
* 08:49 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] (duration: 09m 20s)
* 08:49 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a] (duration: 02m 00s)
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93326 and previous config saved to /var/cache/conftool/dbconfig/20260528-084906-fceratto.json
* 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
* 08:48 slyngshede@dns1004: END - running authdns-update
* 08:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1165: Reboot completed
* 08:47 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a]
* 08:47 slyngs: Upgrade IDP to CAS 7.3.7.1
* 08:46 slyngshede@dns1004: START - running authdns-update
* 08:45 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93324 and previous config saved to /var/cache/conftool/dbconfig/20260528-084149-fceratto.json
* 08:41 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]]
* 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 08:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93323 and previous config saved to /var/cache/conftool/dbconfig/20260528-083504-fceratto.json
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93322 and previous config saved to /var/cache/conftool/dbconfig/20260528-083331-fceratto.json
* 08:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Test
* 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93320 and previous config saved to /var/cache/conftool/dbconfig/20260528-082324-fceratto.json
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2189: repool after crash
* 08:17 slyngshede@dns1004: END - running authdns-update
* 08:16 slyngshede@dns1004: START - running authdns-update
* 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93318 and previous config saved to /var/cache/conftool/dbconfig/20260528-081316-fceratto.json
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:09 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Test
* 08:05 hashar@deploy1003: Finished deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016 (duration: 00m 13s)
* 08:05 hashar@deploy1003: Started deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93315 and previous config saved to /var/cache/conftool/dbconfig/20260528-080309-fceratto.json
* 07:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93314 and previous config saved to /var/cache/conftool/dbconfig/20260528-075631-fceratto.json
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020,1022-1023].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
* 07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93313 and previous config saved to /var/cache/conftool/dbconfig/20260528-075521-fceratto.json
* 07:47 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93311 and previous config saved to /var/cache/conftool/dbconfig/20260528-074513-fceratto.json
* 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2189: repool after crash
* 07:36 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93309 and previous config saved to /var/cache/conftool/dbconfig/20260528-073506-fceratto.json
* 07:34 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:29 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] (duration: 06m 29s)
* 07:25 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Continuing with deployment
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93308 and previous config saved to /var/cache/conftool/dbconfig/20260528-072458-fceratto.json
* 07:24 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:24 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:23 tgr@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwikisource --logwiki=metawiki Ioed Renamed_user_4232d41570b9e8f46ef150e5e360e446 # [[phab:T427459|T427459]]
* 07:22 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]]
* 07:20 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] (duration: 06m 54s)
* 07:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93307 and previous config saved to /var/cache/conftool/dbconfig/20260528-071836-fceratto.json
* 07:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 07:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Reboot completed
* 07:16 wmde-fisch@deploy1003: wmde-fisch, robertsky: Continuing with deployment
* 07:15 wmde-fisch@deploy1003: wmde-fisch, robertsky: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]]
* 07:11 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] (duration: 07m 15s)
* 07:07 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Continuing with deployment
* 07:06 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:04 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]]
* 06:43 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Reboot completed
* 06:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93303 and previous config saved to /var/cache/conftool/dbconfig/20260528-064217-fceratto.json
* 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93302 and previous config saved to /var/cache/conftool/dbconfig/20260528-063357-fceratto.json
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 06:25 hashar: Restarting CI Jenkins for plugins upgrades
* 06:16 fceratto@dns1005: END - running authdns-update
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1209 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93301 and previous config saved to /var/cache/conftool/dbconfig/20260528-061609-fceratto.json
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1193 to s8 primary and set section read-write [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93300 and previous config saved to /var/cache/conftool/dbconfig/20260528-061138-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s8 eqiad as read-only for maintenance - [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93299 and previous config saved to /var/cache/conftool/dbconfig/20260528-061048-fceratto.json
* 06:10 federico3: Starting s8 eqiad failover from db1209 to db1193 - [[phab:T426095|T426095]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1193 with weight 0 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93298 and previous config saved to /var/cache/conftool/dbconfig/20260528-060412-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 [[phab:T426095|T426095]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:53 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:49 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:25 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] (duration: 07m 12s)
* 00:21 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]]
* 00:12 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] (duration: 07m 25s)
* 00:09 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:06 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:04 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]]
* 00:04 swfrench-wmf: reprepro include php-apcu_5.1.24-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
== 2026-05-27 ==
* 23:13 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] (duration: 08m 42s)
* 23:09 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Continuing with deployment
* 23:06 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:04 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]]
* 22:58 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] (duration: 07m 49s)
* 22:55 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.sanitarium_restart (exit_code=0)
* 22:54 catrope@deploy1003: catrope: Continuing with deployment
* 22:52 catrope@deploy1003: catrope: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:50 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]]
* 22:46 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] (duration: 06m 54s)
* 22:42 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:41 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:40 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitarium_restart (exit_code=99)
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]]
* 22:39 ladsgroup@deploy1003: Finished scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) (duration: 07m 16s)
* 22:35 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:34 ladsgroup@deploy1003: ladsgroup: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:33 ladsgroup@deploy1003: Started scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]])
* 22:13 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] (duration: 10m 00s)
* 22:09 egardner@deploy1003: egardner: Continuing with deployment
* 22:05 egardner@deploy1003: egardner: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:03 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]]
* 21:37 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15 days, 0:00:00 on relforge[1008-1010].eqiad.wmnet with reason: non-production environment
* 21:20 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 21:19 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 21:04 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] (duration: 07m 38s)
* 20:59 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Continuing with deployment
* 20:58 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:56 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]]
* 20:51 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 30s)
* 20:47 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:46 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:44 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]]
* 20:43 swfrench-wmf: reprepro include dh-php_5.5+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:39 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts lvs1016.eqiad.wmnet
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:38 swfrench-wmf: reprepro include php-defaults_94+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:37 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:31 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:27 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf12u2 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:25 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs1016.eqiad.wmnet
* 20:25 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] (duration: 08m 11s)
* 20:21 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1016.eqiad.wmnet with OS bullseye
* 20:21 sbisson@deploy1003: sbisson: Continuing with deployment
* 20:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1020.eqiad.wmnet
* 20:19 sbisson@deploy1003: sbisson: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]]
* 20:14 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12355
* 20:04 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 12355
* 19:51 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1016.eqiad.wmnet with OS bullseye
* 19:48 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5032.eqsin.wmnet
* 19:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 19:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 19:01 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f] (duration: 02m 08s)
* 18:59 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f]
* 18:58 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 05m 01s)
* 18:53 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:53 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] (duration: 07m 41s)
* 18:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5031.eqsin.wmnet
* 18:49 catrope@deploy1003: catrope: Continuing with deployment
* 18:47 catrope@deploy1003: catrope: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:45 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]]
* 18:40 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 01m 05s)
* 18:39 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:37 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f] (duration: 02m 04s)
* 18:35 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f]
* 18:29 swfrench@deploy1003: Finished scap sync-world: Helmfile-only deployment to clean up unused mesh listeners (duration: 06m 12s)
* 18:25 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:24 swfrench@deploy1003: swfrench: Helmfile-only deployment to clean up unused mesh listeners synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:23 swfrench@deploy1003: Started scap sync-world: Helmfile-only deployment to clean up unused mesh listeners
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93296 and previous config saved to /var/cache/conftool/dbconfig/20260527-181923-fceratto.json
* 18:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93295 and previous config saved to /var/cache/conftool/dbconfig/20260527-180915-fceratto.json
* 18:09 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 18:09 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] (duration: 10m 24s)
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1017.eqiad.wmnet
* 18:08 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1017.eqiad.wmnet
* 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5024.eqsin.wmnet
* 18:03 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:00 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:00 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:00 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93294 and previous config saved to /var/cache/conftool/dbconfig/20260527-175908-fceratto.json
* 17:58 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]]
* 17:55 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93293 and previous config saved to /var/cache/conftool/dbconfig/20260527-174900-fceratto.json
* 17:43 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] (duration: 15m 01s)
* 17:38 swfrench@deploy1003: swfrench: Continuing with deployment
* 17:31 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:28 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]]
* 17:25 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1114.eqiad.wmnet
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:05 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] (duration: 08m 44s)
* 17:00 swfrench@deploy1003: swfrench: Continuing with deployment
* 16:58 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:56 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]]
* 16:53 atsuko@dns1004: END - running authdns-update
* 16:51 atsuko@dns1004: START - running authdns-update
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93292 and previous config saved to /var/cache/conftool/dbconfig/20260527-164846-fceratto.json
* 16:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93291 and previous config saved to /var/cache/conftool/dbconfig/20260527-164815-fceratto.json
* 16:43 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1112.eqiad.wmnet
* 16:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: Setting up
* 16:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93290 and previous config saved to /var/cache/conftool/dbconfig/20260527-163808-fceratto.json
* 16:37 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Repooling after testing patch
* 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93287 and previous config saved to /var/cache/conftool/dbconfig/20260527-162800-fceratto.json
* 16:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93285 and previous config saved to /var/cache/conftool/dbconfig/20260527-161753-fceratto.json
* 16:14 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 16:12 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 16:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93284 and previous config saved to /var/cache/conftool/dbconfig/20260527-161101-fceratto.json
* 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
* 16:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93283 and previous config saved to /var/cache/conftool/dbconfig/20260527-161034-fceratto.json
* 16:10 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 16:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1178: Recovering from failure in cookbook
* 16:10 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 16:05 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host durum5003.eqsin.wmnet with OS trixie
* 16:03 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6016.drmrs.wmnet
* 16:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93280 and previous config saved to /var/cache/conftool/dbconfig/20260527-160027-fceratto.json
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1017.eqiad.wmnet
* 15:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2163.codfw.wmnet
* 15:53 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2163.codfw.wmnet
* 15:52 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1017.eqiad.wmnet
* 15:52 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Repooling after testing patch
* 15:52 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 15:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Testing cookbook
* 15:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Testing cookbook
* 15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93276 and previous config saved to /var/cache/conftool/dbconfig/20260527-155019-fceratto.json
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93274 and previous config saved to /var/cache/conftool/dbconfig/20260527-154011-fceratto.json
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1178: Recovering from failure in cookbook
* 15:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1178.eqiad.wmnet
* 15:22 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1178.eqiad.wmnet
* 15:19 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 15:19 cdanis: 💙cdanis@cp4047.ulsfo.wmnet ~ 🕦☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:13 cdanis: 💙cdanis@cp5026.eqsin.wmnet ~ 🕚☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Icinga wait failed during run
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕚☕ sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93268 and previous config saved to /var/cache/conftool/dbconfig/20260527-150508-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1220.eqiad.wmnet with reason: Maintenance
* 15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93267 and previous config saved to /var/cache/conftool/dbconfig/20260527-150438-fceratto.json
* 14:59 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 14:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93264 and previous config saved to /var/cache/conftool/dbconfig/20260527-145430-fceratto.json
* 14:54 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2163.codfw.wmnet with OS trixie
* 14:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:46 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] (duration: 08m 32s)
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1178.eqiad.wmnet with OS trixie
* 14:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93263 and previous config saved to /var/cache/conftool/dbconfig/20260527-144423-fceratto.json
* 14:42 aude@deploy1003: aude: Continuing with deployment
* 14:40 aude@deploy1003: aude: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2189.codfw.wmnet with reason: crashed [[phab:T427376|T427376]]
* 14:38 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]]
* 14:35 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] (duration: 11m 30s)
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93262 and previous config saved to /var/cache/conftool/dbconfig/20260527-143416-fceratto.json
* 14:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 aude@deploy1003: aude: Continuing with deployment
* 14:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:27 aude@deploy1003: aude: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93260 and previous config saved to /var/cache/conftool/dbconfig/20260527-142659-fceratto.json
* 14:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:23 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]]
* 14:22 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1033.eqiad.wmnet with reason: Maintenance
* 14:18 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] (duration: 33m 01s)
* 14:10 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2163.codfw.wmnet with OS trixie
* 14:09 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS trixie
* 14:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1178: Upgrading db1178.eqiad.wmnet
* 14:07 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1178: Upgrading db1178.eqiad.wmnet
* 14:06 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 stran@deploy1003: stran: Continuing with deployment
* 14:02 stran@deploy1003: stran: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:56 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2164: Migration of db2164.codfw.wmnet completed
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:45 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]]
* 13:40 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] (duration: 11m 35s)
* 13:36 phuedx@deploy1003: phuedx: Continuing with deployment
* 13:30 phuedx@deploy1003: phuedx: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]]
* 13:21 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] (duration: 13m 23s)
* 13:15 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2189: Test
* 13:15 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 13:15 mlitn@deploy1003: krinkle, mlitn: Continuing with deployment
* 13:13 mlitn@deploy1003: krinkle, mlitn: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2164: Migration of db2164.codfw.wmnet completed
* 13:08 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]]
* 13:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2212.codfw.wmnet with reason: failed to reboot [[phab:T427388|T427388]] [[phab:T426633|T426633]]
* 13:05 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2164.codfw.wmnet with OS trixie
* 12:57 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1192.eqiad.wmnet with OS trixie
* 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:35 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:28 Amir1: deleting binlogs older than a year
* 12:22 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2164.codfw.wmnet with OS trixie
* 12:21 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 36692
* 12:21 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1192.eqiad.wmnet with OS trixie
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1077
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
* 12:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2164: Upgrading db2164.codfw.wmnet
* 12:20 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 36692
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1079
* 12:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2164: Upgrading db2164.codfw.wmnet
* 12:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1192: Upgrading db1192.eqiad.wmnet
* 12:19 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1192: Upgrading db1192.eqiad.wmnet
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:15 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Migration of db2165.codfw.wmnet completed
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:12 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2189: Test
* 12:11 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1193: Migration of db1193.eqiad.wmnet completed
* 12:09 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93243 and previous config saved to /var/cache/conftool/dbconfig/20260527-120452-fceratto.json
* 12:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93242 and previous config saved to /var/cache/conftool/dbconfig/20260527-120205-fceratto.json
* 12:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 11:58 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:58 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:56 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93239 and previous config saved to /var/cache/conftool/dbconfig/20260527-115157-fceratto.json
* 11:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93237 and previous config saved to /var/cache/conftool/dbconfig/20260527-114149-fceratto.json
* 11:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93235 and previous config saved to /var/cache/conftool/dbconfig/20260527-113142-fceratto.json
* 11:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Migration of db2165.codfw.wmnet completed
* 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1193: Migration of db1193.eqiad.wmnet completed
* 11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93231 and previous config saved to /var/cache/conftool/dbconfig/20260527-112327-fceratto.json
* 11:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2188.codfw.wmnet with reason: Maintenance
* 11:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93230 and previous config saved to /var/cache/conftool/dbconfig/20260527-112257-fceratto.json
* 11:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2165.codfw.wmnet with OS trixie
* 11:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1193.eqiad.wmnet with OS trixie
* 11:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93229 and previous config saved to /var/cache/conftool/dbconfig/20260527-111250-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:10 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93227 and previous config saved to /var/cache/conftool/dbconfig/20260527-110242-fceratto.json
* 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2189', diff saved to https://phabricator.wikimedia.org/P93226 and previous config saved to /var/cache/conftool/dbconfig/20260527-110016-marostegui.json
* 10:58 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:57 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 10:56 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93225 and previous config saved to /var/cache/conftool/dbconfig/20260527-105235-fceratto.json
* 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93223 and previous config saved to /var/cache/conftool/dbconfig/20260527-104518-fceratto.json
* 10:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93222 and previous config saved to /var/cache/conftool/dbconfig/20260527-104449-fceratto.json
* 10:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2165.codfw.wmnet with OS trixie
* 10:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1193.eqiad.wmnet with OS trixie
* 10:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2165: Upgrading db2165.codfw.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2165: Upgrading db2165.codfw.wmnet
* 10:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93218 and previous config saved to /var/cache/conftool/dbconfig/20260527-103441-fceratto.json
* 10:29 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:29 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93217 and previous config saved to /var/cache/conftool/dbconfig/20260527-102434-fceratto.json
* 10:22 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:21 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93215 and previous config saved to /var/cache/conftool/dbconfig/20260527-101426-fceratto.json
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1203: Migration of db1203.eqiad.wmnet completed
* 10:10 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2166: Migration of db2166.codfw.wmnet completed
* 10:08 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93212 and previous config saved to /var/cache/conftool/dbconfig/20260527-100701-fceratto.json
* 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 10:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93211 and previous config saved to /var/cache/conftool/dbconfig/20260527-100632-fceratto.json
* 10:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 10:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1050.eqiad.wmnet with OS trixie
* 09:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93208 and previous config saved to /var/cache/conftool/dbconfig/20260527-095624-fceratto.json
* 09:47 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93206 and previous config saved to /var/cache/conftool/dbconfig/20260527-094616-fceratto.json
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:43 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:38 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 09:38 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 09:37 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main'.
* 09:37 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 09:36 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93203 and previous config saved to /var/cache/conftool/dbconfig/20260527-093609-fceratto.json
* 09:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93202 and previous config saved to /var/cache/conftool/dbconfig/20260527-092842-fceratto.json
* 09:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 09:28 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1203: Migration of db1203.eqiad.wmnet completed
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93200 and previous config saved to /var/cache/conftool/dbconfig/20260527-092814-fceratto.json
* 09:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1050.eqiad.wmnet with OS trixie
* 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 09:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2166: Migration of db2166.codfw.wmnet completed
* 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2051: repool after maintenance
* 09:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1203.eqiad.wmnet with OS trixie
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93196 and previous config saved to /var/cache/conftool/dbconfig/20260527-091806-fceratto.json
* 09:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2166.codfw.wmnet with OS trixie
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93194 and previous config saved to /var/cache/conftool/dbconfig/20260527-090759-fceratto.json
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3074.*
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3066.*
* 09:03 fabfur: repooling cp3074 and cp3066 ([[phab:T419825|T419825]])
* 09:02 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6015.drmrs.wmnet
* 09:02 slyngshede@cumin1003: START - Cookbook sre.hosts.remove-downtime for cp6015.drmrs.wmnet
* 09:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 09:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp6015.*
* 08:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93193 and previous config saved to /var/cache/conftool/dbconfig/20260527-085751-fceratto.json
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 08:54 Emperor: restart swift on ms-fe2011 [[phab:T360913|T360913]]
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3066.*
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3074.*
* 08:51 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:50 fabfur: depooling and installing haproxy-awslc on cp3074 and cp3066 ([[phab:T419825|T419825]])
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93191 and previous config saved to /var/cache/conftool/dbconfig/20260527-085024-fceratto.json
* 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93190 and previous config saved to /var/cache/conftool/dbconfig/20260527-085005-fceratto.json
* 08:41 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1203.eqiad.wmnet with OS trixie
* 08:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93189 and previous config saved to /var/cache/conftool/dbconfig/20260527-083957-fceratto.json
* 08:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2051: repool after maintenance
* 08:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 08:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:35 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2166.codfw.wmnet with OS trixie
* 08:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2051.codfw.wmnet with OS trixie
* 08:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93185 and previous config saved to /var/cache/conftool/dbconfig/20260527-082950-fceratto.json
* 08:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93184 and previous config saved to /var/cache/conftool/dbconfig/20260527-081942-fceratto.json
* 08:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:11 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93183 and previous config saved to /var/cache/conftool/dbconfig/20260527-081112-fceratto.json
* 08:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93182 and previous config saved to /var/cache/conftool/dbconfig/20260527-081054-fceratto.json
* 08:07 jmm@dns1004: END - running authdns-update
* 08:05 jmm@dns1004: START - running authdns-update
* 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93181 and previous config saved to /var/cache/conftool/dbconfig/20260527-080046-fceratto.json
* 07:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2051.codfw.wmnet with OS trixie
* 07:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93180 and previous config saved to /var/cache/conftool/dbconfig/20260527-075039-fceratto.json
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1026.eqiad.wmnet
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2051: Upgrading es2051.codfw.wmnet
* 07:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2051: Upgrading es2051.codfw.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93178 and previous config saved to /var/cache/conftool/dbconfig/20260527-074031-fceratto.json
* 07:40 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] (duration: 06m 42s)
* 07:36 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 07:35 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93177 and previous config saved to /var/cache/conftool/dbconfig/20260527-073504-fceratto.json
* 07:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2248.codfw.wmnet with reason: Maintenance
* 07:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93176 and previous config saved to /var/cache/conftool/dbconfig/20260527-073434-fceratto.json
* 07:33 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]]
* 07:28 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93175 and previous config saved to /var/cache/conftool/dbconfig/20260527-072426-fceratto.json
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.decommission (exit_code=0)
* 07:23 marostegui@cumin1003: Removing pc1014 from zarcillo [[phab:T427190|T427190]]
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1014.eqiad.wmnet
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:23 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:18 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1026.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1025.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93174 and previous config saved to /var/cache/conftool/dbconfig/20260527-071418-fceratto.json
* 07:13 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1014.eqiad.wmnet
* 07:13 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2003.wikimedia.org
* 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2055: repool after maintenance
* 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2003.wikimedia.org
* 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1003.wikimedia.org
* 07:06 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: Maintenance on db1190
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93172 and previous config saved to /var/cache/conftool/dbconfig/20260527-070410-fceratto.json
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1003.wikimedia.org
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93171 and previous config saved to /var/cache/conftool/dbconfig/20260527-065545-fceratto.json
* 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93170 and previous config saved to /var/cache/conftool/dbconfig/20260527-065526-fceratto.json
* 06:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1025.eqiad.wmnet
* 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93168 and previous config saved to /var/cache/conftool/dbconfig/20260527-064519-fceratto.json
* 06:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93166 and previous config saved to /var/cache/conftool/dbconfig/20260527-063511-fceratto.json
* 06:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93165 and previous config saved to /var/cache/conftool/dbconfig/20260527-062503-fceratto.json
* 06:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2055: repool after maintenance
* 06:21 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2055.codfw.wmnet with OS trixie
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93163 and previous config saved to /var/cache/conftool/dbconfig/20260527-061643-fceratto.json
* 06:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93162 and previous config saved to /var/cache/conftool/dbconfig/20260527-061613-fceratto.json
* 06:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93161 and previous config saved to /var/cache/conftool/dbconfig/20260527-060606-fceratto.json
* 06:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:56 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93160 and previous config saved to /var/cache/conftool/dbconfig/20260527-055558-fceratto.json
* 05:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93159 and previous config saved to /var/cache/conftool/dbconfig/20260527-054550-fceratto.json
* 05:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2055.codfw.wmnet with OS trixie
* 05:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:38 moritzm: remove ganeti1026 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93157 and previous config saved to /var/cache/conftool/dbconfig/20260527-053727-fceratto.json
* 05:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93156 and previous config saved to /var/cache/conftool/dbconfig/20260527-053708-fceratto.json
* 05:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93155 and previous config saved to /var/cache/conftool/dbconfig/20260527-052700-fceratto.json
* 05:26 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1014 from dbctl [[phab:T427270|T427270]]', diff saved to https://phabricator.wikimedia.org/P93154 and previous config saved to /var/cache/conftool/dbconfig/20260527-052624-marostegui.json
* 05:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93153 and previous config saved to /var/cache/conftool/dbconfig/20260527-051653-fceratto.json
* 05:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93152 and previous config saved to /var/cache/conftool/dbconfig/20260527-050645-fceratto.json
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93151 and previous config saved to /var/cache/conftool/dbconfig/20260527-045827-fceratto.json
* 04:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93150 and previous config saved to /var/cache/conftool/dbconfig/20260527-045759-fceratto.json
* 04:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93149 and previous config saved to /var/cache/conftool/dbconfig/20260527-044751-fceratto.json
* 04:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93148 and previous config saved to /var/cache/conftool/dbconfig/20260527-043744-fceratto.json
* 04:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93147 and previous config saved to /var/cache/conftool/dbconfig/20260527-042737-fceratto.json
* 04:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93146 and previous config saved to /var/cache/conftool/dbconfig/20260527-041921-fceratto.json
* 04:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 04:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93145 and previous config saved to /var/cache/conftool/dbconfig/20260527-041852-fceratto.json
* 04:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93144 and previous config saved to /var/cache/conftool/dbconfig/20260527-040844-fceratto.json
* 03:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93143 and previous config saved to /var/cache/conftool/dbconfig/20260527-035836-fceratto.json
* 03:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93142 and previous config saved to /var/cache/conftool/dbconfig/20260527-034828-fceratto.json
* 03:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93141 and previous config saved to /var/cache/conftool/dbconfig/20260527-034008-fceratto.json
* 03:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 03:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93140 and previous config saved to /var/cache/conftool/dbconfig/20260527-033938-fceratto.json
* 03:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93139 and previous config saved to /var/cache/conftool/dbconfig/20260527-032931-fceratto.json
* 03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93138 and previous config saved to /var/cache/conftool/dbconfig/20260527-031923-fceratto.json
* 03:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93137 and previous config saved to /var/cache/conftool/dbconfig/20260527-030915-fceratto.json
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93136 and previous config saved to /var/cache/conftool/dbconfig/20260527-030045-fceratto.json
* 03:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: Maintenance
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93135 and previous config saved to /var/cache/conftool/dbconfig/20260527-030016-fceratto.json
* 02:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93134 and previous config saved to /var/cache/conftool/dbconfig/20260527-025008-fceratto.json
* 02:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93133 and previous config saved to /var/cache/conftool/dbconfig/20260527-024000-fceratto.json
* 02:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93132 and previous config saved to /var/cache/conftool/dbconfig/20260527-022953-fceratto.json
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93131 and previous config saved to /var/cache/conftool/dbconfig/20260527-022133-fceratto.json
* 02:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93130 and previous config saved to /var/cache/conftool/dbconfig/20260527-022100-fceratto.json
* 02:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93129 and previous config saved to /var/cache/conftool/dbconfig/20260527-021053-fceratto.json
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93128 and previous config saved to /var/cache/conftool/dbconfig/20260527-020045-fceratto.json
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93127 and previous config saved to /var/cache/conftool/dbconfig/20260527-015037-fceratto.json
* 01:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93126 and previous config saved to /var/cache/conftool/dbconfig/20260527-014204-fceratto.json
* 01:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 01:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93125 and previous config saved to /var/cache/conftool/dbconfig/20260527-014134-fceratto.json
* 01:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93124 and previous config saved to /var/cache/conftool/dbconfig/20260527-013126-fceratto.json
* 01:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93123 and previous config saved to /var/cache/conftool/dbconfig/20260527-012119-fceratto.json
* 01:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93122 and previous config saved to /var/cache/conftool/dbconfig/20260527-011111-fceratto.json
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93121 and previous config saved to /var/cache/conftool/dbconfig/20260527-010234-fceratto.json
* 01:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93120 and previous config saved to /var/cache/conftool/dbconfig/20260527-010205-fceratto.json
* 00:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93119 and previous config saved to /var/cache/conftool/dbconfig/20260527-005157-fceratto.json
* 00:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93118 and previous config saved to /var/cache/conftool/dbconfig/20260527-004149-fceratto.json
* 00:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93117 and previous config saved to /var/cache/conftool/dbconfig/20260527-003141-fceratto.json
* 00:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93116 and previous config saved to /var/cache/conftool/dbconfig/20260527-002309-fceratto.json
* 00:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 00:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93115 and previous config saved to /var/cache/conftool/dbconfig/20260527-002228-fceratto.json
* 00:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93114 and previous config saved to /var/cache/conftool/dbconfig/20260527-001220-fceratto.json
* 00:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93113 and previous config saved to /var/cache/conftool/dbconfig/20260527-000209-fceratto.json
== 2026-05-26 ==
* 23:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93112 and previous config saved to /var/cache/conftool/dbconfig/20260526-235201-fceratto.json
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93111 and previous config saved to /var/cache/conftool/dbconfig/20260526-234451-fceratto.json
* 23:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93110 and previous config saved to /var/cache/conftool/dbconfig/20260526-234421-fceratto.json
* 23:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93109 and previous config saved to /var/cache/conftool/dbconfig/20260526-233414-fceratto.json
* 23:27 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 23:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93108 and previous config saved to /var/cache/conftool/dbconfig/20260526-232406-fceratto.json
* 23:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93107 and previous config saved to /var/cache/conftool/dbconfig/20260526-231358-fceratto.json
* 23:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93106 and previous config saved to /var/cache/conftool/dbconfig/20260526-230650-fceratto.json
* 23:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93105 and previous config saved to /var/cache/conftool/dbconfig/20260526-230620-fceratto.json
* 22:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93104 and previous config saved to /var/cache/conftool/dbconfig/20260526-225612-fceratto.json
* 22:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93103 and previous config saved to /var/cache/conftool/dbconfig/20260526-224604-fceratto.json
* 22:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93101 and previous config saved to /var/cache/conftool/dbconfig/20260526-223556-fceratto.json
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93100 and previous config saved to /var/cache/conftool/dbconfig/20260526-222848-fceratto.json
* 22:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93099 and previous config saved to /var/cache/conftool/dbconfig/20260526-222828-fceratto.json
* 22:23 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp6015.drmrs.wmnet
* 22:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93098 and previous config saved to /var/cache/conftool/dbconfig/20260526-221819-fceratto.json
* 22:10 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS trixie
* 22:08 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS trixie
* 22:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93097 and previous config saved to /var/cache/conftool/dbconfig/20260526-220811-fceratto.json
* 22:04 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] (duration: 09m 30s)
* 22:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 22:00 egardner@deploy1003: egardner, mfossati: Continuing with deployment
* 21:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93096 and previous config saved to /var/cache/conftool/dbconfig/20260526-215803-fceratto.json
* 21:57 egardner@deploy1003: egardner, mfossati: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:56 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:56 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1010.eqiad.wmnet with OS trixie
* 21:56 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:55 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]]
* 21:54 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 21:51 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93095 and previous config saved to /var/cache/conftool/dbconfig/20260526-215043-fceratto.json
* 21:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93094 and previous config saved to /var/cache/conftool/dbconfig/20260526-215011-fceratto.json
* 21:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:47 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1009
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1009
* 21:43 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1009
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:42 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1008
* 21:40 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1008
* 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93093 and previous config saved to /var/cache/conftool/dbconfig/20260526-214003-fceratto.json
* 21:36 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1008
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:36 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:35 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1010
* 21:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1010
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1010.eqiad.wmnet with OS trixie
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1009
* 21:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS trixie
* 21:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93092 and previous config saved to /var/cache/conftool/dbconfig/20260526-212955-fceratto.json
* 21:29 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1008
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS trixie
* 21:27 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "all.dblist - mediamoderation-continuous-scan.dblist - preinstall.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose` in tmux session - [[phab:T421688|T421688]]
* 21:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93091 and previous config saved to /var/cache/conftool/dbconfig/20260526-211948-fceratto.json
* 21:19 jhathaway: dmarc ingress test run mx-in1001
* 21:15 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw and A:cp
* 21:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2057.codfw.wmnet
* 21:14 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_codfw and A:cp
* 21:14 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2058.codfw.wmnet
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93090 and previous config saved to /var/cache/conftool/dbconfig/20260526-211238-fceratto.json
* 21:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2222.codfw.wmnet with reason: Maintenance
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93089 and previous config saved to /var/cache/conftool/dbconfig/20260526-211207-fceratto.json
* 21:06 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 21:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93088 and previous config saved to /var/cache/conftool/dbconfig/20260526-210159-fceratto.json
* 20:55 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2003.codfw.wmnet with reason: WIP
* 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93087 and previous config saved to /var/cache/conftool/dbconfig/20260526-205152-fceratto.json
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:50 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:45 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93086 and previous config saved to /var/cache/conftool/dbconfig/20260526-204143-fceratto.json
* 20:38 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2055.codfw.wmnet
* 20:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93085 and previous config saved to /var/cache/conftool/dbconfig/20260526-203430-fceratto.json
* 20:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2221.codfw.wmnet with reason: Maintenance
* 20:34 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2056.codfw.wmnet
* 20:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93084 and previous config saved to /var/cache/conftool/dbconfig/20260526-203357-fceratto.json
* 20:32 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 20:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 20:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93083 and previous config saved to /var/cache/conftool/dbconfig/20260526-202349-fceratto.json
* 20:18 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] (duration: 09m 14s)
* 20:14 alexsanford@deploy1003: alexsanford, aude: Continuing with deployment
* 20:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93082 and previous config saved to /var/cache/conftool/dbconfig/20260526-201341-fceratto.json
* 20:11 alexsanford@deploy1003: alexsanford, aude: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]]
* 20:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93081 and previous config saved to /var/cache/conftool/dbconfig/20260526-200333-fceratto.json
* 19:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2053.codfw.wmnet
* 19:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2029.codfw.wmnet with OS trixie
* 19:57 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2028.codfw.wmnet with OS trixie
* 19:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93080 and previous config saved to /var/cache/conftool/dbconfig/20260526-195632-fceratto.json
* 19:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
* 19:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93079 and previous config saved to /var/cache/conftool/dbconfig/20260526-195557-fceratto.json
* 19:55 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2054.codfw.wmnet
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93078 and previous config saved to /var/cache/conftool/dbconfig/20260526-194549-fceratto.json
* 19:45 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2014.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2013.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:38 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93077 and previous config saved to /var/cache/conftool/dbconfig/20260526-193541-fceratto.json
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:30 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93076 and previous config saved to /var/cache/conftool/dbconfig/20260526-192533-fceratto.json
* 19:24 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:21 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 19:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2051.codfw.wmnet
* 19:19 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93075 and previous config saved to /var/cache/conftool/dbconfig/20260526-191818-fceratto.json
* 19:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 19:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93074 and previous config saved to /var/cache/conftool/dbconfig/20260526-191748-fceratto.json
* 19:16 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2052.codfw.wmnet
* 19:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93073 and previous config saved to /var/cache/conftool/dbconfig/20260526-190740-fceratto.json
* 19:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 19:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93072 and previous config saved to /var/cache/conftool/dbconfig/20260526-185732-fceratto.json
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93071 and previous config saved to /var/cache/conftool/dbconfig/20260526-184724-fceratto.json
* 18:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
* 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
* 18:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb2014.codfw.wmnet with OS trixie
* 18:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2049.codfw.wmnet
* 18:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93070 and previous config saved to /var/cache/conftool/dbconfig/20260526-184009-fceratto.json
* 18:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93069 and previous config saved to /var/cache/conftool/dbconfig/20260526-183939-fceratto.json
* 18:37 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2050.codfw.wmnet
* 18:30 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93068 and previous config saved to /var/cache/conftool/dbconfig/20260526-182931-fceratto.json
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:29 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:24 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 18:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93066 and previous config saved to /var/cache/conftool/dbconfig/20260526-181923-fceratto.json
* 18:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93065 and previous config saved to /var/cache/conftool/dbconfig/20260526-180915-fceratto.json
* 18:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93064 and previous config saved to /var/cache/conftool/dbconfig/20260526-180205-fceratto.json
* 18:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 18:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93063 and previous config saved to /var/cache/conftool/dbconfig/20260526-180132-fceratto.json
* 18:00 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2047.codfw.wmnet
* 17:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2048.codfw.wmnet
* 17:54 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93062 and previous config saved to /var/cache/conftool/dbconfig/20260526-175124-fceratto.json
* 17:42 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] (duration: 07m 25s)
* 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93060 and previous config saved to /var/cache/conftool/dbconfig/20260526-174117-fceratto.json
* 17:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ms-be2089.codfw.wmnet
* 17:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]]
* 17:33 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93059 and previous config saved to /var/cache/conftool/dbconfig/20260526-173109-fceratto.json
* 17:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:26 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:25 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93058 and previous config saved to /var/cache/conftool/dbconfig/20260526-172332-fceratto.json
* 17:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93057 and previous config saved to /var/cache/conftool/dbconfig/20260526-172303-fceratto.json
* 17:21 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet
* 17:20 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 17:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet
* 17:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:17 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 17:14 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:13 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:13 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93056 and previous config saved to /var/cache/conftool/dbconfig/20260526-171255-fceratto.json
* 17:11 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93055 and previous config saved to /var/cache/conftool/dbconfig/20260526-170247-fceratto.json
* 17:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:57 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93054 and previous config saved to /var/cache/conftool/dbconfig/20260526-165240-fceratto.json
* 16:50 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93053 and previous config saved to /var/cache/conftool/dbconfig/20260526-164421-fceratto.json
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
* 16:44 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93052 and previous config saved to /var/cache/conftool/dbconfig/20260526-164352-fceratto.json
* 16:42 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2043.codfw.wmnet
* 16:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2044.codfw.wmnet
* 16:40 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:40 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:40 brett: reboot lvs 101[345].eqiad.wmnet
* 16:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:36 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:35 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_codfw and A:cp
* 16:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93051 and previous config saved to /var/cache/conftool/dbconfig/20260526-163344-fceratto.json
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_codfw and A:cp
* 16:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:31 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93050 and previous config saved to /var/cache/conftool/dbconfig/20260526-162336-fceratto.json
* 16:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93049 and previous config saved to /var/cache/conftool/dbconfig/20260526-161328-fceratto.json
* 16:11 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:11 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=eqiad
* 16:06 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93047 and previous config saved to /var/cache/conftool/dbconfig/20260526-160450-fceratto.json
* 16:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93046 and previous config saved to /var/cache/conftool/dbconfig/20260526-160420-fceratto.json
* 16:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:03 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 28s)
* 16:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 16:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 22s)
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93045 and previous config saved to /var/cache/conftool/dbconfig/20260526-155413-fceratto.json
* 15:46 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=eqiad
* 15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93044 and previous config saved to /var/cache/conftool/dbconfig/20260526-154405-fceratto.json
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93043 and previous config saved to /var/cache/conftool/dbconfig/20260526-153357-fceratto.json
* 15:30 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93042 and previous config saved to /var/cache/conftool/dbconfig/20260526-152629-fceratto.json
* 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93041 and previous config saved to /var/cache/conftool/dbconfig/20260526-152559-fceratto.json
* 15:24 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93040 and previous config saved to /var/cache/conftool/dbconfig/20260526-151552-fceratto.json
* 15:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Rack maintenance completed
* 15:10 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2196.codfw.wmnet
* 15:10 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2196.codfw.wmnet
* 15:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=codfw
* 15:06 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: Rack maintenance completed
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93037 and previous config saved to /var/cache/conftool/dbconfig/20260526-150546-fceratto.json
* 15:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: Rack maintenance completed
* 15:04 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]] (duration: 00m 39s)
* 15:03 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]]
* 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]] (duration: 00m 45s)
* 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]]
* 15:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 15:01 bjensen: uploading prometheus-memcached-exporter_0.16.0-1_amd64 on apt1002
* 15:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 15:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2223: switch maintenance
* 14:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Rack maintenance completed
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93033 and previous config saved to /var/cache/conftool/dbconfig/20260526-145538-fceratto.json
* 14:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 moritzm: remove ganeti1025 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2222: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2221: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:49 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93030 and previous config saved to /var/cache/conftool/dbconfig/20260526-144718-fceratto.json
* 14:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 14:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93029 and previous config saved to /var/cache/conftool/dbconfig/20260526-144651-fceratto.json
* 14:45 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:45 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:43 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=codfw
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2167: Migration of db2167.codfw.wmnet completed
* 14:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93026 and previous config saved to /var/cache/conftool/dbconfig/20260526-143643-fceratto.json
* 14:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1054.eqiad.wmnet with OS trixie
* 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93023 and previous config saved to /var/cache/conftool/dbconfig/20260526-142636-fceratto.json
* 14:26 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1014: Rack maintenance completed
* 14:24 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1014: Rack maintenance completed
* 14:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 14:19 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:19 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:18 jynus: restarting mediabackups@codfw after maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93021 and previous config saved to /var/cache/conftool/dbconfig/20260526-141628-fceratto.json
* 14:16 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:14 fabfur: repooled cp2043 ([[phab:T426199|T426199]])
* 14:14 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2223: switch maintenance
* 14:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:14 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 14:13 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] (duration: 06m 40s)
* 14:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:10 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:10 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2011.codfw.wmnet
* 14:10 fabfur@cumin1003: START - Cookbook sre.hosts.remove-downtime for lvs2011.codfw.wmnet
* 14:09 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:09 fabfur: restoring lvs2011 as primary ([[phab:T426199|T426199]])
* 14:08 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:08 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93017 and previous config saved to /var/cache/conftool/dbconfig/20260526-140748-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93016 and previous config saved to /var/cache/conftool/dbconfig/20260526-140718-fceratto.json
* 14:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]]
* 14:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
* 14:05 marostegui@cumin1003: Removing pc1013 from zarcillo [[phab:T427190|T427190]]
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1013.eqiad.wmnet
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:04 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:00 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 13:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93014 and previous config saved to /var/cache/conftool/dbconfig/20260526-135711-fceratto.json
* 13:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1054.eqiad.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2167: Migration of db2167.codfw.wmnet completed
* 13:53 Amir1: drop flaggedrevs tables on cawikinews ([[phab:T423577|T423577]])
* 13:49 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1013.eqiad.wmnet
* 13:49 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 13:48 Lucas_WMDE: UTC afternoon backport+config window done
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93012 and previous config saved to /var/cache/conftool/dbconfig/20260526-134703-fceratto.json
* 13:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2167.codfw.wmnet with OS trixie
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93011 and previous config saved to /var/cache/conftool/dbconfig/20260526-133656-fceratto.json
* 13:36 XioNoX: reboot lsw1-a2-codfw for software upgrade - [[phab:T426199|T426199]]
* 13:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: switch maintenance
* 13:35 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] (duration: 09m 28s)
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2221: switch maintenance
* 13:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: switch maintenance
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2196: switch maintenance
* 13:31 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:30 stran@deploy1003: stran: Continuing with deployment
* 13:29 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93006 and previous config saved to /var/cache/conftool/dbconfig/20260526-132927-fceratto.json
* 13:29 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: Maintenance
* 13:29 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 34 hosts with reason: Switch maintenance
* 13:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93005 and previous config saved to /var/cache/conftool/dbconfig/20260526-132857-fceratto.json
* 13:28 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lsw1-a2-codfw,lsw1-a2-codfw IPv6,lsw1-a2-codfw.mgmt with reason: Switch maintenance
* 13:27 stran@deploy1003: stran: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]]
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] (duration: 08m 30s)
* 13:22 ladsgroup@dns1004: END - running authdns-update
* 13:20 ladsgroup@dns1004: START - running authdns-update
* 13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93004 and previous config saved to /var/cache/conftool/dbconfig/20260526-131850-fceratto.json
* 13:18 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Continuing with deployment
* 13:16 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]]
* 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] (duration: 07m 09s)
* 13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93003 and previous config saved to /var/cache/conftool/dbconfig/20260526-130842-fceratto.json
* 13:08 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:07 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2167.codfw.wmnet with OS trixie
* 13:07 sbisson@deploy1003: sbisson: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2167: Upgrading db2167.codfw.wmnet
* 13:05 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]]
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2167: Upgrading db2167.codfw.wmnet
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:04 kart_: Update Recommendation API to 2026-05-26-074931-production
* 13:03 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:00 topranks: deactivate CR BGP to doh2002 to test backup path via doh2001
* 12:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93000 and previous config saved to /var/cache/conftool/dbconfig/20260526-125834-fceratto.json
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92999 and previous config saved to /var/cache/conftool/dbconfig/20260526-125135-fceratto.json
* 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2226.codfw.wmnet with reason: Maintenance
* 12:51 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92998 and previous config saved to /var/cache/conftool/dbconfig/20260526-125105-fceratto.json
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92997 and previous config saved to /var/cache/conftool/dbconfig/20260526-124059-fceratto.json
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2003.wikimedia.org
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1214: Migration of db1214.eqiad.wmnet completed
* 12:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc2003.wikimedia.org
* 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92995 and previous config saved to /var/cache/conftool/dbconfig/20260526-123052-fceratto.json
* 12:26 fabfur: depooled cp2043 for network activity ([[phab:T426199|T426199]])
* 12:26 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 12:24 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-a1-codfw,ssw1-a1-codfw IPv6,ssw1-a1-codfw.mgmt with reason: Switch maintenance
* 12:24 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mirror1001.wikimedia.org
* 12:23 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 12:23 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 12:22 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92993 and previous config saved to /var/cache/conftool/dbconfig/20260526-122044-fceratto.json
* 12:20 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 12:19 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mirror1001.wikimedia.org
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92991 and previous config saved to /var/cache/conftool/dbconfig/20260526-121336-fceratto.json
* 12:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: Maintenance
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92990 and previous config saved to /var/cache/conftool/dbconfig/20260526-121306-fceratto.json
* 12:09 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Planned downtime for rack maintenance
* 12:08 fabfur: downtime, disable puppet and stop pybal for rack maintenance ([[phab:T426199|T426199]])
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2181: Migration of db2181.codfw.wmnet completed
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92987 and previous config saved to /var/cache/conftool/dbconfig/20260526-120258-fceratto.json
* 12:01 XioNoX: start ssw1-a1-codfw network maintenance (no impact expected as the spines are redundant)
* 11:59 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] (duration: 15m 26s)
* 11:56 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup2015.codfw.wmnet,db2197.codfw.wmnet with reason: network maintenance
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1005.eqiad.wmnet
* 11:55 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
* 11:54 jynus: stopping mediabackups@codfw for maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 11:54 jmm@dns1004: END - running authdns-update
* 11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92985 and previous config saved to /var/cache/conftool/dbconfig/20260526-115251-fceratto.json
* 11:52 jmm@dns1004: START - running authdns-update
* 11:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1005.eqiad.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1214: Migration of db1214.eqiad.wmnet completed
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1004.eqiad.wmnet
* 11:47 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:46 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1004.eqiad.wmnet
* 11:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]]
* 11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92983 and previous config saved to /var/cache/conftool/dbconfig/20260526-114243-fceratto.json
* 11:42 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1214.eqiad.wmnet with OS trixie
* 11:35 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] (duration: 06m 46s)
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92981 and previous config saved to /var/cache/conftool/dbconfig/20260526-113542-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92980 and previous config saved to /var/cache/conftool/dbconfig/20260526-113521-fceratto.json
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1222: Migration of db1222.eqiad.wmnet completed
* 11:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]]
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92978 and previous config saved to /var/cache/conftool/dbconfig/20260526-112513-fceratto.json
* 11:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92977 and previous config saved to /var/cache/conftool/dbconfig/20260526-112326-marostegui.json
* 11:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2181: Migration of db2181.codfw.wmnet completed
* 11:22 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92975 and previous config saved to /var/cache/conftool/dbconfig/20260526-112215-marostegui.json
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Switchover es2042 es2041 for [[phab:T426199|T426199]]', diff saved to https://phabricator.wikimedia.org/P92974 and previous config saved to /var/cache/conftool/dbconfig/20260526-112028-fceratto.json
* 11:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92972 and previous config saved to /var/cache/conftool/dbconfig/20260526-111506-fceratto.json
* 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2181.codfw.wmnet with OS trixie
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92971 and previous config saved to /var/cache/conftool/dbconfig/20260526-110458-fceratto.json
* 11:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1214.eqiad.wmnet with OS trixie
* 11:00 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] (duration: 15m 50s)
* 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92968 and previous config saved to /var/cache/conftool/dbconfig/20260526-105755-fceratto.json
* 10:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92967 and previous config saved to /var/cache/conftool/dbconfig/20260526-105726-fceratto.json
* 10:56 jiji@deploy1003: jiji: Continuing with deployment
* 10:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:51 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92966 and previous config saved to /var/cache/conftool/dbconfig/20260526-104718-fceratto.json
* 10:46 jiji@deploy1003: jiji: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:44 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]]
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92964 and previous config saved to /var/cache/conftool/dbconfig/20260526-103711-fceratto.json
* 10:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2181.codfw.wmnet with OS trixie
* 10:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:32 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92963 and previous config saved to /var/cache/conftool/dbconfig/20260526-102703-fceratto.json
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1226: Migration of db1226.eqiad.wmnet completed
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92960 and previous config saved to /var/cache/conftool/dbconfig/20260526-101936-fceratto.json
* 10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92959 and previous config saved to /var/cache/conftool/dbconfig/20260526-101842-fceratto.json
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:15 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] (duration: 06m 42s)
* 10:09 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92957 and previous config saved to /var/cache/conftool/dbconfig/20260526-100834-fceratto.json
* 10:06 kharlan@deploy1003: kharlan: Continuing with deployment
* 10:05 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:03 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]]
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2195: Migration of db2195.codfw.wmnet completed
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2004.codfw.wmnet
* 10:01 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2004.codfw.wmnet
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
* 09:58 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92955 and previous config saved to /var/cache/conftool/dbconfig/20260526-095827-fceratto.json
* 09:58 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-eqiad@eqiad
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:55 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2003.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2003.codfw.wmnet
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1006.eqiad.wmnet
* 09:53 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1006.eqiad.wmnet
* 09:52 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-eqiad@eqiad
* 09:52 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] (duration: 08m 07s)
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2044.*
* 09:48 fabfur: repooling cp2043 and cp2044 (haproxy-awslc) ([[phab:T419825|T419825]])
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92953 and previous config saved to /var/cache/conftool/dbconfig/20260526-094819-fceratto.json
* 09:47 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:46 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1006.eqiad.wmnet
* 09:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:44 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]]
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1006.eqiad.wmnet
* 09:41 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1005.eqiad.wmnet
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1005.eqiad.wmnet
* 09:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92951 and previous config saved to /var/cache/conftool/dbconfig/20260526-094115-fceratto.json
* 09:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 09:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92950 and previous config saved to /var/cache/conftool/dbconfig/20260526-094045-fceratto.json
* 09:40 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1226: Migration of db1226.eqiad.wmnet completed
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:38 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:34 fabfur: depooling cp2044 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1005.eqiad.wmnet
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2003.codfw.wmnet
* 09:34 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2044.*
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1005.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1004.eqiad.wmnet
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1004.eqiad.wmnet
* 09:33 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 09:32 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] (duration: 06m 52s)
* 09:32 fabfur: depooling cp2043 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1226.eqiad.wmnet with OS trixie
* 09:30 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 09:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92947 and previous config saved to /var/cache/conftool/dbconfig/20260526-093031-fceratto.json
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2003.codfw.wmnet
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2002.codfw.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2002.codfw.wmnet
* 09:28 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:28 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1003.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1003.eqiad.wmnet
* 09:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]]
* 09:25 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:25 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2001.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2001.codfw.wmnet
* 09:21 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:20 fabfur: start rebooting esams liberica instances ([[phab:T426563|T426563]])
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92946 and previous config saved to /var/cache/conftool/dbconfig/20260526-092024-fceratto.json
* 09:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1003.eqiad.wmnet
* 09:16 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2195: Migration of db2195.codfw.wmnet completed
* 09:15 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1003.eqiad.wmnet
* 09:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 09:14 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] (duration: 06m 47s)
* 09:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92944 and previous config saved to /var/cache/conftool/dbconfig/20260526-091016-fceratto.json
* 09:09 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 09:09 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2195.codfw.wmnet with OS trixie
* 09:07 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]]
* 09:06 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92943 and previous config saved to /var/cache/conftool/dbconfig/20260526-090315-fceratto.json
* 09:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2224.codfw.wmnet with reason: Maintenance
* 09:03 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92942 and previous config saved to /var/cache/conftool/dbconfig/20260526-090256-fceratto.json
* 08:57 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1226.eqiad.wmnet with OS trixie
* 08:53 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 fabfur: start rebooting ulsfo liberica instances ([[phab:T426563|T426563]])
* 08:53 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] (duration: 07m 23s)
* 08:53 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1226: Upgrading db1226.eqiad.wmnet
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92941 and previous config saved to /var/cache/conftool/dbconfig/20260526-085248-fceratto.json
* 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1226: Upgrading db1226.eqiad.wmnet
* 08:50 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Migration of db1222.eqiad.wmnet completed
* 08:48 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 08:47 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:46 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]]
* 08:43 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 08:43 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] (duration: 09m 56s)
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92939 and previous config saved to /var/cache/conftool/dbconfig/20260526-084240-fceratto.json
* 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1222.eqiad.wmnet with OS trixie
* 08:40 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:40 fabfur: start rebooting eqsin liberica instances ([[phab:T426563|T426563]])
* 08:39 kartik@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 08:39 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:39 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1024.eqiad.wmnet
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:35 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:33 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]]
* 08:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92938 and previous config saved to /var/cache/conftool/dbconfig/20260526-083233-fceratto.json
* 08:30 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92937 and previous config saved to /var/cache/conftool/dbconfig/20260526-082531-fceratto.json
* 08:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
* 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92936 and previous config saved to /var/cache/conftool/dbconfig/20260526-082458-fceratto.json
* 08:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2195.codfw.wmnet with OS trixie
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2195: Upgrading db2195.codfw.wmnet
* 08:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2195: Upgrading db2195.codfw.wmnet
* 08:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:18 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92934 and previous config saved to /var/cache/conftool/dbconfig/20260526-081451-fceratto.json
* 08:13 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:12 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:10 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92932 and previous config saved to /var/cache/conftool/dbconfig/20260526-080443-fceratto.json
* 08:01 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1222.eqiad.wmnet with OS trixie
* 08:00 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1024.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1023.eqiad.wmnet
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:59 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:58 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:56 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 07:56 fabfur: start rebooting drmrs liberica instances ([[phab:T426563|T426563]])
* 07:56 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92931 and previous config saved to /var/cache/conftool/dbconfig/20260526-075435-fceratto.json
* 07:52 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1047.eqiad.wmnet
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:49 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1023.eqiad.wmnet
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92930 and previous config saved to /var/cache/conftool/dbconfig/20260526-074739-fceratto.json
* 07:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92929 and previous config saved to /var/cache/conftool/dbconfig/20260526-074710-fceratto.json
* 07:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:45 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:43 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1046.eqiad.wmnet
* 07:38 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] (duration: 12m 01s)
* 07:38 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P92928 and previous config saved to /var/cache/conftool/dbconfig/20260526-073702-fceratto.json
* 07:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:35 fabfur: start rebooting magru liberica instances ([[phab:T426563|T426563]])
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92926 and previous config saved to /var/cache/conftool/dbconfig/20260526-073459-fceratto.json
* 07:32 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
* 07:31 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1046.eqiad.wmnet
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff not saved (unable to send diff to phaste), previous config saved to /var/cache/conftool/dbconfig/20260526-072643-fceratto.json
* 07:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
* 07:26 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]]
* 07:25 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92924 and previous config saved to /var/cache/conftool/dbconfig/20260526-072452-fceratto.json
* 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1047.eqiad.wmnet
* 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1047.eqiad.wmnet
* 07:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92923 and previous config saved to /var/cache/conftool/dbconfig/20260526-071635-fceratto.json
* 07:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
* 07:15 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1026.eqiad.wmnet
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92922 and previous config saved to /var/cache/conftool/dbconfig/20260526-071444-fceratto.json
* 07:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 07:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92921 and previous config saved to /var/cache/conftool/dbconfig/20260526-070946-fceratto.json
* 07:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92920 and previous config saved to /var/cache/conftool/dbconfig/20260526-070916-fceratto.json
* 07:09 moritzm: failover Ganeti master in eqiad to ganeti1048
* 07:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1047.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1046.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92919 and previous config saved to /var/cache/conftool/dbconfig/20260526-070436-fceratto.json
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
* 07:04 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1048.eqiad.wmnet
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92918 and previous config saved to /var/cache/conftool/dbconfig/20260526-065909-fceratto.json
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast2003.wikimedia.org
* 06:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 06:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1048.eqiad.wmnet
* 06:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
* 06:53 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1046.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1045.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 06:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92917 and previous config saved to /var/cache/conftool/dbconfig/20260526-064901-fceratto.json
* 06:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92916 and previous config saved to /var/cache/conftool/dbconfig/20260526-064833-fceratto.json
* 06:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
* 06:47 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1222: Switchover
* 06:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6003.wikimedia.org
* 06:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92914 and previous config saved to /var/cache/conftool/dbconfig/20260526-063853-fceratto.json
* 06:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6003.wikimedia.org
* 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92912 and previous config saved to /var/cache/conftool/dbconfig/20260526-063155-fceratto.json
* 06:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:28 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Switchover
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1222 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92910 and previous config saved to /var/cache/conftool/dbconfig/20260526-061656-fceratto.json
* 06:15 fceratto@dns1005: END - running authdns-update
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1162 to s2 primary and set section read-write [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92909 and previous config saved to /var/cache/conftool/dbconfig/20260526-061114-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s2 eqiad as read-only for maintenance - [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92908 and previous config saved to /var/cache/conftool/dbconfig/20260526-061021-fceratto.json
* 06:10 federico3: Starting s2 eqiad failover from db1222 to db1162 - [[phab:T425622|T425622]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1162 with weight 0 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92907 and previous config saved to /var/cache/conftool/dbconfig/20260526-060443-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T425622|T425622]]
* 06:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:02 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:00 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2024.codfw.wmnet,pc[1014,1024].eqiad.wmnet with reason: Maintenance on pc4
* 04:37 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 04:34 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.1 (duration: 02m 32s)
* 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]] (duration: 36m 24s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 20s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-25 ==
* 21:00 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:49 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 20:38 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1045.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1044.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:15 moritzm: truncate krb5kdc.log1 (which made log rotation fail)
* 20:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 19:57 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1044.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1043.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:22 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_eqiad
* 18:49 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1115.eqiad.wmnet
* 18:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5023.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:22 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:22 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5030.eqsin.wmnet
* 18:15 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5023.eqsin.wmnet
* 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1113.eqiad.wmnet
* 18:03 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:02 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqiad
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-upload_eqsin
* 18:01 sukhe: sre.cdn.roll-reboot cookbooks stalled due to icinga reboot
* 18:00 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqsin
* 17:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1043.eqiad.wmnet
* 17:31 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1110.eqiad.wmnet [reason: manually pooling after reboot as icinga was down]
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1042.eqiad.wmnet
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:29 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1111.eqiad.wmnet
* 17:28 sukhe: sukhe@alert1002:~$ sudo systemctl restart icinga.service
* 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92903 and previous config saved to /var/cache/conftool/dbconfig/20260525-171310-fceratto.json
* 17:11 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92902 and previous config saved to /var/cache/conftool/dbconfig/20260525-170302-fceratto.json
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92901 and previous config saved to /var/cache/conftool/dbconfig/20260525-165255-fceratto.json
* 16:51 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1042.eqiad.wmnet
* 16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92900 and previous config saved to /var/cache/conftool/dbconfig/20260525-164247-fceratto.json
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1041.eqiad.wmnet
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:41 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:40 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5021.eqsin.wmnet
* 16:39 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5029.eqsin.wmnet
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92899 and previous config saved to /var/cache/conftool/dbconfig/20260525-163559-fceratto.json
* 16:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92898 and previous config saved to /var/cache/conftool/dbconfig/20260525-163512-fceratto.json
* 16:34 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1108.eqiad.wmnet
* 16:30 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1109.eqiad.wmnet
* 16:26 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92897 and previous config saved to /var/cache/conftool/dbconfig/20260525-162505-fceratto.json
* 16:20 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1041.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1040.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:16 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92896 and previous config saved to /var/cache/conftool/dbconfig/20260525-161457-fceratto.json
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92895 and previous config saved to /var/cache/conftool/dbconfig/20260525-160450-fceratto.json
* 16:02 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92894 and previous config saved to /var/cache/conftool/dbconfig/20260525-155930-fceratto.json
* 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2249.codfw.wmnet with reason: Maintenance
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5020.eqsin.wmnet
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5028.eqsin.wmnet
* 15:52 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1106.eqiad.wmnet
* 15:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1107.eqiad.wmnet
* 15:29 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1040.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1039.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:17 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1013 from dbctl [[phab:T427190|T427190]]', diff saved to https://phabricator.wikimedia.org/P92893 and previous config saved to /var/cache/conftool/dbconfig/20260525-151718-marostegui.json
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5019.eqsin.wmnet
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5027.eqsin.wmnet
* 15:12 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1104.eqiad.wmnet
* 15:11 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1105.eqiad.wmnet
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92892 and previous config saved to /var/cache/conftool/dbconfig/20260525-150309-fceratto.json
* 14:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92891 and previous config saved to /var/cache/conftool/dbconfig/20260525-145301-fceratto.json
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92890 and previous config saved to /var/cache/conftool/dbconfig/20260525-144253-fceratto.json
* 14:33 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1102.eqiad.wmnet
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92889 and previous config saved to /var/cache/conftool/dbconfig/20260525-143246-fceratto.json
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5026.eqsin.wmnet
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5018.eqsin.wmnet
* 14:31 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1103.eqiad.wmnet
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92888 and previous config saved to /var/cache/conftool/dbconfig/20260525-142551-fceratto.json
* 14:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2228.codfw.wmnet with reason: Maintenance
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92887 and previous config saved to /var/cache/conftool/dbconfig/20260525-142520-fceratto.json
* 14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92885 and previous config saved to /var/cache/conftool/dbconfig/20260525-141513-fceratto.json
* 14:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:06 sukhe: curl localhost:9090/pools/inference-staging-grpc_30051 shows ml-staging200[1-3].codfw.wmnet as enabled and pooled: [[phab:T424049|T424049]]
* 14:05 sukhe: sukhe@lvs2013:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92884 and previous config saved to /var/cache/conftool/dbconfig/20260525-140505-fceratto.json
* 14:03 sukhe: sudo cumin 'A:lvs and A:lvs-low-traffic-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service
* 14:00 sukhe: sudo cumin 'A:lvs and A:lvs-secondary-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 13:59 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1039.eqiad.wmnet
* 13:58 sukhe: sudo cumin 'A:lvs and A:eqiad' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"': NOOP change, since service is codfw only
* 13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92882 and previous config saved to /var/cache/conftool/dbconfig/20260525-135458-fceratto.json
* 13:52 Msz2001: Everything deployed, UTC afternoon config+backport window done
* 13:52 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] (duration: 09m 43s)
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1101.eqiad.wmnet
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1100.eqiad.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5025.eqsin.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:49 kart_: Updated Recommendation API to 2026-05-21-044522-production
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92881 and previous config saved to /var/cache/conftool/dbconfig/20260525-134807-fceratto.json
* 13:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2223.codfw.wmnet with reason: Maintenance
* 13:47 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:47 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92880 and previous config saved to /var/cache/conftool/dbconfig/20260525-134737-fceratto.json
* 13:45 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: Reboot
* 13:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]]
* 13:40 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqiad
* 13:39 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqiad
* 13:38 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] (duration: 08m 14s)
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqsin
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqsin
* 13:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92878 and previous config saved to /var/cache/conftool/dbconfig/20260525-133729-fceratto.json
* 13:34 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:33 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1038.eqiad.wmnet
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:31 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]]
* 13:27 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] (duration: 07m 43s)
* 13:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92876 and previous config saved to /var/cache/conftool/dbconfig/20260525-132722-fceratto.json
* 13:23 mszwarc@deploy1003: mszwarc, jhsoby: Continuing with deployment
* 13:21 mszwarc@deploy1003: mszwarc, jhsoby: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:20 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]]
* 13:19 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] (duration: 15m 53s)
* 13:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92875 and previous config saved to /var/cache/conftool/dbconfig/20260525-131714-fceratto.json
* 13:12 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92873 and previous config saved to /var/cache/conftool/dbconfig/20260525-131023-fceratto.json
* 13:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92872 and previous config saved to /var/cache/conftool/dbconfig/20260525-130950-fceratto.json
* 13:07 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]]
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1162: Reboot
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92870 and previous config saved to /var/cache/conftool/dbconfig/20260525-125942-fceratto.json
* 12:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reboot
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:58 kart_: Updated cxserver to 2026-05-24-103047-production ([[phab:T426808|T426808]], [[phab:T373418|T373418]])
* 12:56 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:56 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:54 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db1162: Reboot
* 12:54 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:54 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:53 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1162.eqiad.wmnet with reason: Reboot
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92868 and previous config saved to /var/cache/conftool/dbconfig/20260525-124934-fceratto.json
* 12:40 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:39 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:39 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1038.eqiad.wmnet
* 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92867 and previous config saved to /var/cache/conftool/dbconfig/20260525-123927-fceratto.json
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92866 and previous config saved to /var/cache/conftool/dbconfig/20260525-123239-fceratto.json
* 12:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92865 and previous config saved to /var/cache/conftool/dbconfig/20260525-123208-fceratto.json
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92864 and previous config saved to /var/cache/conftool/dbconfig/20260525-122201-fceratto.json
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1037.eqiad.wmnet
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92863 and previous config saved to /var/cache/conftool/dbconfig/20260525-121153-fceratto.json
* 12:10 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92862 and previous config saved to /var/cache/conftool/dbconfig/20260525-120145-fceratto.json
* 11:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92861 and previous config saved to /var/cache/conftool/dbconfig/20260525-115504-fceratto.json
* 11:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92860 and previous config saved to /var/cache/conftool/dbconfig/20260525-115434-fceratto.json
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92859 and previous config saved to /var/cache/conftool/dbconfig/20260525-114426-fceratto.json
* 11:43 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1037.eqiad.wmnet
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92858 and previous config saved to /var/cache/conftool/dbconfig/20260525-113419-fceratto.json
* 11:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2160.codfw.wmnet with OS trixie
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92857 and previous config saved to /var/cache/conftool/dbconfig/20260525-112411-fceratto.json
* 11:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92856 and previous config saved to /var/cache/conftool/dbconfig/20260525-111717-fceratto.json
* 11:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92855 and previous config saved to /var/cache/conftool/dbconfig/20260525-111648-fceratto.json
* 11:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92854 and previous config saved to /var/cache/conftool/dbconfig/20260525-110640-fceratto.json
* 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 10:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:56 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92853 and previous config saved to /var/cache/conftool/dbconfig/20260525-105633-fceratto.json
* 10:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92852 and previous config saved to /var/cache/conftool/dbconfig/20260525-104625-fceratto.json
* 10:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2160.codfw.wmnet with OS trixie
* 10:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92851 and previous config saved to /var/cache/conftool/dbconfig/20260525-104141-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to pc3 as master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92850 and previous config saved to /var/cache/conftool/dbconfig/20260525-104055-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to dbctl', diff saved to https://phabricator.wikimedia.org/P92849 and previous config saved to /var/cache/conftool/dbconfig/20260525-104027-marostegui.json
* 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92848 and previous config saved to /var/cache/conftool/dbconfig/20260525-103944-fceratto.json
* 10:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 10:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
* 10:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
* 10:27 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:16 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1006.eqiad.wmnet
* 09:57 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.eqiad.wmnet
* 09:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92847 and previous config saved to /var/cache/conftool/dbconfig/20260525-091302-fceratto.json
* 09:12 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92846 and previous config saved to /var/cache/conftool/dbconfig/20260525-090255-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92845 and previous config saved to /var/cache/conftool/dbconfig/20260525-085247-fceratto.json
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92844 and previous config saved to /var/cache/conftool/dbconfig/20260525-084239-fceratto.json
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92843 and previous config saved to /var/cache/conftool/dbconfig/20260525-083540-fceratto.json
* 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2231.codfw.wmnet with reason: Maintenance
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92842 and previous config saved to /var/cache/conftool/dbconfig/20260525-083511-fceratto.json
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92841 and previous config saved to /var/cache/conftool/dbconfig/20260525-082504-fceratto.json
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92840 and previous config saved to /var/cache/conftool/dbconfig/20260525-081456-fceratto.json
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92839 and previous config saved to /var/cache/conftool/dbconfig/20260525-080448-fceratto.json
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92838 and previous config saved to /var/cache/conftool/dbconfig/20260525-075739-fceratto.json
* 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: Maintenance
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92837 and previous config saved to /var/cache/conftool/dbconfig/20260525-075708-fceratto.json
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92836 and previous config saved to /var/cache/conftool/dbconfig/20260525-074700-fceratto.json
* 07:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92835 and previous config saved to /var/cache/conftool/dbconfig/20260525-073653-fceratto.json
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92834 and previous config saved to /var/cache/conftool/dbconfig/20260525-072645-fceratto.json
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92833 and previous config saved to /var/cache/conftool/dbconfig/20260525-071953-fceratto.json
* 07:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2196.codfw.wmnet with reason: Maintenance
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92832 and previous config saved to /var/cache/conftool/dbconfig/20260525-071924-fceratto.json
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92831 and previous config saved to /var/cache/conftool/dbconfig/20260525-070917-fceratto.json
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2233.codfw.wmnet with OS trixie
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92830 and previous config saved to /var/cache/conftool/dbconfig/20260525-065909-fceratto.json
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92829 and previous config saved to /var/cache/conftool/dbconfig/20260525-064902-fceratto.json
* 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92828 and previous config saved to /var/cache/conftool/dbconfig/20260525-064305-fceratto.json
* 06:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2233.codfw.wmnet with OS trixie
* 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2233.codfw.wmnet with reason: Reimage to Trixie
* 06:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:17 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboot upgrade m2
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2233.codfw.wmnet with reason: Reboot upgrade m2
* 06:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1027.eqiad.wmnet with reason: Reboot
* 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2023.codfw.wmnet,pc[1013,1023].eqiad.wmnet with reason: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1013.eqiad.wmnet: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1013.eqiad.wmnet: Maintenance on pc3
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 43s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-24 ==
* 19:08 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-23 ==
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-22 ==
* 23:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 23:38 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 22:20 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:29 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:28 inflatador: bking@deploy1003 set eqiad prod cirrus `node_concurrent_recoveries` up to 7 from 4 [[phab:T426585|T426585]]
* 20:27 inflatador: bking@deploy1003 set codfw prod cirrus `node_concurrent_recoveries` back down to 4 from 7 [[phab:T426585|T426585]]
* 18:39 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 17:34 topranks: enable ttl protection on esams CRs IBGP session
* 17:28 topranks: enable ttl protection on ulsfo CRs IBGP session
* 16:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:49 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:16 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:58 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:15 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:14 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2008-dev.codfw.wmnet
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb[1020,1022-1025].eqiad.wmnet
* 14:29 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:23 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2008-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2007-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:03 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 13:59 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb[1020,1022-1025].eqiad.wmnet
* 13:58 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 13:53 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2007-dev.codfw.wmnet
* 13:52 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1018.eqiad.wmnet
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:46 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for 6 hosts
* 13:16 inflatador: bking@deploy1002 set search_codfw cluster recovery settings from 4 to 7 [[phab:T426560|T426560]]
* 13:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for 6 hosts
* 13:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 13:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 13:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:10 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet
* 13:09 elukey: uploaded spicerack_12.6.0 to apt.wikimedia.org bookworm-wikimedia
* 13:08 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1017.eqiad.wmnet
* 12:59 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3081.esams.wmnet
* 12:54 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:41 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:15 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3080.esams.wmnet
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 12:03 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3073.esams.wmnet
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: Migration of db2154.codfw.wmnet completed
* 11:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3072.esams.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 11:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1172: Migration of db1172.eqiad.wmnet completed
* 11:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
* 11:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3079.esams.wmnet
* 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
* 10:55 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:48 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet
* 10:43 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2154: Migration of db2154.codfw.wmnet completed
* 10:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
* 10:37 moritzm: remove ganeti1024 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2154.codfw.wmnet with OS trixie
* 10:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2010.codfw.wmnet with OS trixie
* 10:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1172: Migration of db1172.eqiad.wmnet completed
* 10:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3078.esams.wmnet
* 10:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS trixie
* 10:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1017.eqiad.wmnet
* 10:13 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3071.esams.wmnet
* 09:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:56 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS trixie
* 09:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:51 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: Upgrading db2154.codfw.wmnet
* 09:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2154: Upgrading db2154.codfw.wmnet
* 09:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS trixie
* 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS trixie
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3070.esams.wmnet
* 09:21 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:16 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 09:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3077.esams.wmnet
* 09:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:03 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 08:47 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 08:40 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:30 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3076.esams.wmnet
* 08:18 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:15 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:07 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:07 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3069.esams.wmnet
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 07:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 07:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3068.esams.wmnet
* 07:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:10 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3075.esams.wmnet
* 07:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1057
* 07:01 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1057
* 06:58 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3067.esams.wmnet
* 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
* 06:46 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 06:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3007.wikimedia.org
* 06:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3007.wikimedia.org
* 05:25 marostegui@dns1004: END - running authdns-update
* 05:24 marostegui@dns1004: START - running authdns-update
* 05:23 marostegui: Failover m5-master [[phab:T426633|T426633]]
* 05:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1028.eqiad.wmnet with reason: Reboot
* 05:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy2005.codfw.wmnet with reason: Reboot
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1012.eqiad.wmnet
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:06 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:03 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 04:56 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1012.eqiad.wmnet
== 2026-05-21 ==
* 23:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] (duration: 06m 42s)
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified
* 23:36 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]]
* 22:26 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2002.codfw.wmnet with OS trixie
* 22:08 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:03 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:02 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:44 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2002.codfw.wmnet with OS trixie
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:20 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:19 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:26 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:16 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 19:22 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase
* 19:10 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:59 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:53 papaul: rebooting msw1-codfw
* 18:50 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:39 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:50 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:48 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:43 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:39 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 17:39 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:38 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 17:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 17:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:36 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:25 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:25 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:23 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:22 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1016.eqiad.wmnet
* 17:22 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:13 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1016.eqiad.wmnet
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92810 and previous config saved to /var/cache/conftool/dbconfig/20260521-170823-ladsgroup.json
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 16:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 16:55 papaul: rebooting msw-d3-codfw
* 16:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:52 papaul: rebooting msw-c7-codfw
* 16:51 papaul: rebooting msw-c6-codfw
* 16:48 papaul: rebooting msw-b7-codfw
* 16:48 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet
* 16:45 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1014.eqiad.wmnet
* 16:43 papaul: rebooting msw-b6-codfw
* 16:40 papaul: rebooting msw-a1-codfw
* 16:37 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:37 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1014.eqiad.wmnet
* 16:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:33 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:26 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc1022.eqiad.wmnet with reason: Move to nftables
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc2022.codfw.wmnet with reason: Move to nftables
* 16:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Repooling
* 16:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92807 and previous config saved to /var/cache/conftool/dbconfig/20260521-161808-ladsgroup.json
* 16:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:52 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Repooling
* 15:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92804 and previous config saved to /var/cache/conftool/dbconfig/20260521-154108-fceratto.json
* 15:39 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92803 and previous config saved to /var/cache/conftool/dbconfig/20260521-153400-fceratto.json
* 15:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2048.codfw.wmnet with reason: Maintenance
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92802 and previous config saved to /var/cache/conftool/dbconfig/20260521-153331-fceratto.json
* 15:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:25 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92801 and previous config saved to /var/cache/conftool/dbconfig/20260521-152323-fceratto.json
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet
* 15:19 claime: Enabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 15:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet
* 15:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92800 and previous config saved to /var/cache/conftool/dbconfig/20260521-151316-fceratto.json
* 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 15:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet
* 15:07 elukey@cumin1003: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:05 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:04 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] (duration: 10m 11s)
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92799 and previous config saved to /var/cache/conftool/dbconfig/20260521-150308-fceratto.json
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet
* 15:00 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master
* 15:00 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:00 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.pki.restart-reboot (exit_code=0) rolling reboot on A:pki
* 14:57 claime: Disabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 14:56 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:55 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 14:54 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-build1001.eqiad.wmnet
* 14:54 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]]
* 14:54 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 14:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1001.eqiad.wmnet
* 14:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1001.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92798 and previous config saved to /var/cache/conftool/dbconfig/20260521-145132-fceratto.json
* 14:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2040.codfw.wmnet with reason: Maintenance
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92797 and previous config saved to /var/cache/conftool/dbconfig/20260521-145103-fceratto.json
* 14:50 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-build1001.eqiad.wmnet
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Migration of db2241.codfw.wmnet completed
* 14:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1001.eqiad.wmnet
* 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
* 14:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1001.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-eqiad
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1011.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1011.eqiad.wmnet
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92795 and previous config saved to /var/cache/conftool/dbconfig/20260521-144055-fceratto.json
* 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:37 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1011.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1011.eqiad.wmnet
* 14:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 14:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1010.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1010.eqiad.wmnet
* 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92793 and previous config saved to /var/cache/conftool/dbconfig/20260521-143045-fceratto.json
* 14:30 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:30 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:29 elukey@cumin1003: START - Cookbook sre.pki.restart-reboot rolling reboot on A:pki
* 14:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
* 14:27 slyngshede@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-reboot (exit_code=1) rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 14:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet
* 14:24 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1010.eqiad.wmnet
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92792 and previous config saved to /var/cache/conftool/dbconfig/20260521-142037-fceratto.json
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:17 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1010.eqiad.wmnet
* 14:14 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1009.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1009.eqiad.wmnet
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: repool after maintenance
* 14:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92789 and previous config saved to /var/cache/conftool/dbconfig/20260521-140906-fceratto.json
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
* 14:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92788 and previous config saved to /var/cache/conftool/dbconfig/20260521-140837-fceratto.json
* 14:08 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1009.eqiad.wmnet
* 14:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet
* 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
* 14:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Migration of db2241.codfw.wmnet completed
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1009.eqiad.wmnet
* 14:03 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1008.eqiad.wmnet
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1008.eqiad.wmnet
* 14:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2241.codfw.wmnet with OS trixie
* 13:59 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
* 13:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92786 and previous config saved to /var/cache/conftool/dbconfig/20260521-135830-fceratto.json
* 13:58 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1007.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1007.eqiad.wmnet
* 13:51 Lucas_WMDE: UTC afternoon backport+config window done
* 13:51 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] (duration: 07m 20s)
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92784 and previous config saved to /var/cache/conftool/dbconfig/20260521-134822-fceratto.json
* 13:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1007.eqiad.wmnet
* 13:47 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 13:46 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
* 13:45 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes
* 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:44 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 13:43 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]]
* 13:43 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 13:43 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1007.eqiad.wmnet
* 13:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1006.eqiad.wmnet
* 13:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1006.eqiad.wmnet
* 13:41 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] (duration: 06m 52s)
* 13:41 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 13:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet
* 13:38 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
* 13:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92782 and previous config saved to /var/cache/conftool/dbconfig/20260521-133815-fceratto.json
* 13:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1006.eqiad.wmnet
* 13:37 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
* 13:37 dbrant@deploy1003: dbrant: Continuing with deployment
* 13:36 dbrant@deploy1003: dbrant: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
* 13:35 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]]
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1006.eqiad.wmnet
* 13:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1005.eqiad.wmnet
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1005.eqiad.wmnet
* 13:31 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] (duration: 09m 11s)
* 13:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92781 and previous config saved to /var/cache/conftool/dbconfig/20260521-133116-fceratto.json
* 13:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92780 and previous config saved to /var/cache/conftool/dbconfig/20260521-133048-fceratto.json
* 13:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
* 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet
* 13:27 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1005.eqiad.wmnet
* 13:27 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: repool after maintenance
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2241.codfw.wmnet with OS trixie
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:24 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:22 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]]
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1005.eqiad.wmnet
* 13:22 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1004.eqiad.wmnet
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1004.eqiad.wmnet
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92778 and previous config saved to /var/cache/conftool/dbconfig/20260521-132041-fceratto.json
* 13:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
* 13:20 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] (duration: 11m 55s)
* 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 13:17 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet
* 13:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1039: Repooling
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
* 13:15 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1004.eqiad.wmnet
* 13:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 13:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1004.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92776 and previous config saved to /var/cache/conftool/dbconfig/20260521-131033-fceratto.json
* 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1003.eqiad.wmnet
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1003.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 13:10 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2241 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92775 and previous config saved to /var/cache/conftool/dbconfig/20260521-131025-cwilliams.json
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:10 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 13:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3074.esams.wmnet
* 13:06 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2162 to x3 primary [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92774 and previous config saved to /var/cache/conftool/dbconfig/20260521-130609-cwilliams.json
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:04 cezmunsta: Starting x3 codfw failover from db2241 to db2162 - [[phab:T426936|T426936]]
* 13:04 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1003.eqiad.wmnet
* 13:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 13:00 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92772 and previous config saved to /var/cache/conftool/dbconfig/20260521-130018-fceratto.json
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1003.eqiad.wmnet
* 12:59 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:59 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1002.eqiad.wmnet
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1002.eqiad.wmnet
* 12:58 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:57 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2162 with weight 0 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92771 and previous config saved to /var/cache/conftool/dbconfig/20260521-125645-cwilliams.json
* 12:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T426936|T426936]]
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet
* 12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
* 12:54 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1002.eqiad.wmnet
* 12:54 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6008.drmrs.wmnet
* 12:53 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:52 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1002.eqiad.wmnet
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
* 12:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:48 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3066.esams.wmnet
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92770 and previous config saved to /var/cache/conftool/dbconfig/20260521-124707-fceratto.json
* 12:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
* 12:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1039: Repooling
* 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet
* 12:45 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:43 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] (duration: 07m 54s)
* 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92768 and previous config saved to /var/cache/conftool/dbconfig/20260521-124014-fceratto.json
* 12:39 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
* 12:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 12:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:36 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:36 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]]
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:34 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:34 kart_: Updated cxserver to 2026-05-20-034002-production ([[phab:T388690|T388690]], [[phab:T404295|T404295]], [[phab:T391703|T391703]], [[phab:T426605|T426605]])
* 12:34 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
* 12:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
* 12:30 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:30 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
* 12:29 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92767 and previous config saved to /var/cache/conftool/dbconfig/20260521-122905-fceratto.json
* 12:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
* 12:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92766 and previous config saved to /var/cache/conftool/dbconfig/20260521-122839-fceratto.json
* 12:27 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:27 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:26 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-staging-worker
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2003.codfw.wmnet
* 12:23 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2003.codfw.wmnet
* 12:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
* 12:21 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:21 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:21 moritzm: installing nginx security updates
* 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 12:20 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92765 and previous config saved to /var/cache/conftool/dbconfig/20260521-121832-fceratto.json
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2003.codfw.wmnet
* 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
* 12:15 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
* 12:13 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6007.drmrs.wmnet
* 12:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92764 and previous config saved to /var/cache/conftool/dbconfig/20260521-120824-fceratto.json
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2003.codfw.wmnet
* 12:07 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2002.codfw.wmnet
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2002.codfw.wmnet
* 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
* 12:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6014.drmrs.wmnet
* 12:00 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:00 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2002.codfw.wmnet
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
* 11:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92763 and previous config saved to /var/cache/conftool/dbconfig/20260521-115817-fceratto.json
* 11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
* 11:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
* 11:51 taavi: disabling puppet on C:bird to roll out {{Gerrit|1289919}}
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92762 and previous config saved to /var/cache/conftool/dbconfig/20260521-115112-fceratto.json
* 11:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2047.codfw.wmnet with reason: Maintenance
* 11:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2002.codfw.wmnet
* 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92761 and previous config saved to /var/cache/conftool/dbconfig/20260521-115043-fceratto.json
* 11:50 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2001.codfw.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2001.codfw.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt2002.wikimedia.org
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet
* 11:45 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2001.codfw.wmnet
* 11:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp1001.eqiad.wmnet
* 11:44 kartik@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet
* 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt2002.wikimedia.org
* 11:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet
* 11:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92760 and previous config saved to /var/cache/conftool/dbconfig/20260521-114036-fceratto.json
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp1001.eqiad.wmnet
* 11:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
* 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 11:36 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet
* 11:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2001.codfw.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker
* 11:35 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
* 11:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
* 11:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
* 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92759 and previous config saved to /var/cache/conftool/dbconfig/20260521-113028-fceratto.json
* 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2014.codfw.wmnet
* 11:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
* 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
* 11:26 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet
* 11:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2014.codfw.wmnet
* 11:20 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6013.drmrs.wmnet
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92758 and previous config saved to /var/cache/conftool/dbconfig/20260521-112021-fceratto.json
* 11:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
* 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2013.codfw.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
* 11:09 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92757 and previous config saved to /var/cache/conftool/dbconfig/20260521-110851-fceratto.json
* 11:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2037.codfw.wmnet with reason: Maintenance
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92756 and previous config saved to /var/cache/conftool/dbconfig/20260521-110822-fceratto.json
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
* 11:05 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2013.codfw.wmnet
* 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:04 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6006.drmrs.wmnet
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
* 11:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
* 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92753 and previous config saved to /var/cache/conftool/dbconfig/20260521-105815-fceratto.json
* 10:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet
* 10:55 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:54 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2012.codfw.wmnet
* 10:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92752 and previous config saved to /var/cache/conftool/dbconfig/20260521-104807-fceratto.json
* 10:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2012.codfw.wmnet
* 10:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet
* 10:44 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] (duration: 08m 02s)
* 10:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:40 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet
* 10:40 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:39 jiji@deploy1003: jiji: Continuing with deployment
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92751 and previous config saved to /var/cache/conftool/dbconfig/20260521-103759-fceratto.json
* 10:37 jiji@deploy1003: jiji: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:36 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]]
* 10:35 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet
* 10:34 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:29 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet
* 10:27 dcausse: [[phab:T423993|T423993]]: reindexing all archive indices
* 10:27 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92749 and previous config saved to /var/cache/conftool/dbconfig/20260521-102630-fceratto.json
* 10:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2036.codfw.wmnet with reason: Maintenance
* 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92748 and previous config saved to /var/cache/conftool/dbconfig/20260521-102601-fceratto.json
* 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2011.codfw.wmnet
* 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6005.drmrs.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet
* 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2011.codfw.wmnet
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet
* 10:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92747 and previous config saved to /var/cache/conftool/dbconfig/20260521-101552-fceratto.json
* 10:15 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:14 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet
* 10:12 moritzm: installing postgresql security updates
* 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet
* 10:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet
* 10:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon1003.wikimedia.org
* 10:09 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:08 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet
* 10:08 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet
* 10:07 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet
* 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet
* 10:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92746 and previous config saved to /var/cache/conftool/dbconfig/20260521-100545-fceratto.json
* 10:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet
* 10:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet
* 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet
* 10:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet
* 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet
* 10:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org
* 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org
* 09:59 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1005.eqiad.wmnet
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-codfw
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2005.codfw.wmnet
* 09:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2005.codfw.wmnet
* 09:56 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
* 09:56 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:56 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92745 and previous config saved to /var/cache/conftool/dbconfig/20260521-095536-fceratto.json
* 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1384.eqiad.wmnet
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2004.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2004.codfw.wmnet
* 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
* 09:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
* 09:49 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1384.eqiad.wmnet
* 09:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1383.eqiad.wmnet
* 09:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92744 and previous config saved to /var/cache/conftool/dbconfig/20260521-094829-fceratto.json
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92743 and previous config saved to /var/cache/conftool/dbconfig/20260521-094801-fceratto.json
* 09:47 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet
* 09:47 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 [[phab:T426563|T426563]]
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-eqiad
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:44 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1383.eqiad.wmnet
* 09:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1382.eqiad.wmnet
* 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
* 09:39 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet
* 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1382.eqiad.wmnet
* 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1381.eqiad.wmnet
* 09:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2002.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2002.codfw.wmnet
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92742 and previous config saved to /var/cache/conftool/dbconfig/20260521-093754-fceratto.json
* 09:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet
* 09:36 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 09:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6012.drmrs.wmnet
* 09:34 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet
* 09:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1381.eqiad.wmnet
* 09:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1380.eqiad.wmnet
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2001.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2001.codfw.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet
* 09:29 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=eqiad
* 09:29 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts.*,name=codfw
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet
* 09:28 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92741 and previous config saved to /var/cache/conftool/dbconfig/20260521-092746-fceratto.json
* 09:27 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1380.eqiad.wmnet
* 09:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1379.eqiad.wmnet
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet
* 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
* 09:25 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet
* 09:24 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=codfw
* 09:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:23 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-eqiad
* 09:22 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1379.eqiad.wmnet
* 09:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1378.eqiad.wmnet
* 09:21 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-codfw
* 09:21 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:18 moritzm: remove ganeti1023 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92740 and previous config saved to /var/cache/conftool/dbconfig/20260521-091738-fceratto.json
* 09:16 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1378.eqiad.wmnet
* 09:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1376.eqiad.wmnet
* 09:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Repooling
* 09:07 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1376.eqiad.wmnet
* 09:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1375.eqiad.wmnet
* 09:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92738 and previous config saved to /var/cache/conftool/dbconfig/20260521-090609-fceratto.json
* 09:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1037.eqiad.wmnet with reason: Maintenance
* 09:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1375.eqiad.wmnet
* 09:01 btullis@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:55 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6011.drmrs.wmnet
* 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:44 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6004.drmrs.wmnet
* 08:37 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Repooling
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92733 and previous config saved to /var/cache/conftool/dbconfig/20260521-082951-fceratto.json
* 08:29 hashar@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92731 and previous config saved to /var/cache/conftool/dbconfig/20260521-081642-fceratto.json
* 08:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
* 08:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6003.drmrs.wmnet
* 08:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1256.eqiad.wmnet with OS trixie
* 07:52 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:51 marostegui@dns1004: END - running authdns-update
* 07:50 marostegui@dns1004: START - running authdns-update
* 07:48 marostegui: Failover m3-master [[phab:T426633|T426633]]
* 07:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet
* 07:46 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6010.drmrs.wmnet
* 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:38 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6002.drmrs.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:24 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:24 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1256.eqiad.wmnet with OS trixie
* 07:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy1025.eqiad.wmnet with reason: Rebooting
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:52 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
* 06:40 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 06:39 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
* 06:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
* 06:24 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:23 arnaudb@cumin1003: END (FAIL) - Cookbook sre.gerrit.reboot-gerrit (exit_code=99) Rebooting Gerrit on gerrit2003
* 06:22 arnaudb@cumin1003: START - Cookbook sre.gerrit.reboot-gerrit Rebooting Gerrit on gerrit2003
* 06:15 marostegui@dns1004: END - running authdns-update
* 06:14 marostegui: Failover m2-master [[phab:T426633|T426633]]
* 06:13 marostegui@dns1004: START - running authdns-update
* 05:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1012 from dbctl [[phab:T426930|T426930]]', diff saved to https://phabricator.wikimedia.org/P92728 and previous config saved to /var/cache/conftool/dbconfig/20260521-053858-marostegui.json
* 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92727 and previous config saved to /var/cache/conftool/dbconfig/20260521-053000-marostegui.json
* 05:29 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1022 to pc2 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92726 and previous config saved to /var/cache/conftool/dbconfig/20260521-052905-marostegui.json
* 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc1012.eqiad.wmnet with reason: Cloning
* 02:41 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on planet1003.eqiad.wmnet with reason: debug wip
* 02:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1027.eqiad.wmnet
* 01:22 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1027.eqiad.wmnet
* 00:55 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2428859
2428858
2026-06-20T13:31:34Z
Stashbot
7414
arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
2428859
wikitext
text/x-wiki
== 2026-06-20 ==
* 13:31 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:31 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-19 ==
* 19:21 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303006{{!}}Disable ShortUrl on remaining wikis (T107188)]] (duration: 80m 14s)
* 19:17 krinkle@deploy1003: krinkle: Continuing with deployment
* 18:03 krinkle@deploy1003: krinkle: Backport for [[gerrit:1303006{{!}}Disable ShortUrl on remaining wikis (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:01 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1303006{{!}}Disable ShortUrl on remaining wikis (T107188)]]
* 16:22 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 16:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1023.eqiad.wmnet
* 16:08 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 16:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1023.eqiad.wmnet
* 16:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1022.eqiad.wmnet
* 15:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1022.eqiad.wmnet
* 15:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1021.eqiad.wmnet
* 15:45 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1021.eqiad.wmnet
* 15:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet
* 15:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1020.eqiad.wmnet
* 15:34 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 15:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2004.codfw.wmnet
* 15:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2004.codfw.wmnet
* 15:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2003.codfw.wmnet
* 15:17 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2003.codfw.wmnet
* 15:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2002.codfw.wmnet
* 15:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2002.codfw.wmnet
* 15:11 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet
* 14:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2009.codfw.wmnet with OS trixie
* 13:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
* 13:41 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
* 13:28 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
* 13:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2001.codfw.wmnet
* 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs1003.eqiad.wmnet
* 12:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs1003.eqiad.wmnet
* 12:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs1002.eqiad.wmnet
* 12:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs1002.eqiad.wmnet
* 12:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs1001.eqiad.wmnet
* 12:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs1001.eqiad.wmnet
* 12:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1022.eqiad.wmnet
* 12:32 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1022.eqiad.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2234.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2234.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2232.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2232.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 12:10 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 12:08 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on phab2002.codfw.wmnet with reason: Host Replacement
* 12:05 urbanecm@deploy1003: mwscript-k8s job started: GrowthExperiments:migrateMentorStatusAway.php --wiki=viwiki # [[phab:T409170|T409170]]
* 12:04 urbanecm@deploy1003: mwscript-k8s job started: GrowthExperiments:MigrateMentorStatusAway --wiki=viwiki # [[phab:T409170|T409170]]
* 11:33 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:38 moritzm: imported nodejs 24.17.0-1nodesource1 to thirdparty/node24 for trixie-wikimedia
* 10:37 moritzm: imported nodejs 22.23.0-1nodesource1 to thirdparty/node22 for trixie-wikimedia
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 10:29 sergi0: Run `MigrateMentorStatusAway` script for all wikis in growthexperiments dblist - [[phab:T409170|T409170]]
* 10:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet
* 10:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1020.eqiad.wmnet
* 10:04 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1024
* 10:03 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1024
* 10:03 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1023
* 10:03 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1023
* 10:03 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1021
* 10:03 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1021
* 10:00 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1024
* 09:59 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1024
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1022
* 09:57 btullis@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1022
* 09:56 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1020
* 09:54 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1020
* 09:43 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet
* 09:36 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1020.eqiad.wmnet
* 07:32 slyngs: Update IDP/SSO to CAS v7.3.7.3
* 07:31 slyngshede@dns1004: END - running authdns-update
* 07:30 slyngshede@dns1004: START - running authdns-update
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 49s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:19 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: sync
* 01:18 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: sync
* 01:18 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: sync
* 01:17 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics: sync
* 01:17 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: sync
* 01:17 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics: sync
* 01:06 ottomata: roll restart eventgate-analytics to pick up stream config change - [[phab:T427787|T427787]]
== 2026-06-18 ==
* 23:46 Amir1: ALTER TABLE reading_list_project AUTO_INCREMENT = 882; on wikishared on x1 master ([[phab:T428002|T428002]])
* 23:34 rzl@deploy1003: Finished deploy [docker-pkg/deploy@f030aed]: (no justification provided) (duration: 00m 45s)
* 23:33 rzl@deploy1003: Started deploy [docker-pkg/deploy@f030aed]: (no justification provided)
* 23:28 rzl@deploy1003: Finished deploy [docker-pkg/deploy@f030aed]: (no justification provided) (duration: 00m 26s)
* 23:27 rzl@deploy1003: Started deploy [docker-pkg/deploy@f030aed]: (no justification provided)
* 23:03 rzl: rzl@apt1002:~$ sudo -i reprepro -C main include trixie-wikimedia /home/rzl/httpbb/trixie/httpbb_0.0.5-1+deb13u1_amd64.changes # [[phab:T427899|T427899]]
* 22:52 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304195{{!}}hCaptcha: Re-enable for mcrundo (T427612)]] (duration: 07m 25s)
* 22:47 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 22:46 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1304195{{!}}hCaptcha: Re-enable for mcrundo (T427612)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1304195{{!}}hCaptcha: Re-enable for mcrundo (T427612)]]
* 21:29 maryum: Deployed security fix for [[phab:T428833|T428833]]
* 21:14 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303493{{!}}Prevent surveys being automatically added to non-Wikipedias (T393436)]] (duration: 07m 54s)
* 21:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 21:10 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 21:09 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 21:08 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1303493{{!}}Prevent surveys being automatically added to non-Wikipedias (T393436)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:06 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1303493{{!}}Prevent surveys being automatically added to non-Wikipedias (T393436)]]
* 20:12 dani@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303895{{!}}Deploy English Wikipedia Mobile App Survey (T428876)]] (duration: 08m 20s)
* 20:08 dani@deploy1003: dani: Continuing with deployment
* 20:06 dani@deploy1003: dani: Backport for [[gerrit:1303895{{!}}Deploy English Wikipedia Mobile App Survey (T428876)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 dani@deploy1003: Started scap sync-world: Backport for [[gerrit:1303895{{!}}Deploy English Wikipedia Mobile App Survey (T428876)]]
* 19:11 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=dns7002.*
* 19:09 cdobbins@dns1004: END - running authdns-update
* 19:08 cdobbins@dns1004: START - running authdns-update
* 19:07 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=dns7002.*,service=authdns-update
* 19:05 cdobbins@dns1004: END - running authdns-update
* 19:04 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2002.codfw.wmnet with reason: Host Replacement
* 19:03 cdobbins@dns1004: START - running authdns-update
* 19:01 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dns7002.wikimedia.org
* 19:01 cdobbins@cumin2002: START - Cookbook sre.hosts.remove-downtime for dns7002.wikimedia.org
* 18:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns7002.wikimedia.org with OS bookworm
* 18:39 jhuneidi@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.7 refs [[phab:T423916|T423916]]
* 18:37 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 18:34 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 18:33 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 18:31 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 18:29 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:28 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:27 swfrench-wmf: (eqiad) kubectl delete pod coredns-54cdd9bdf-6hwb5 -n kube-system - [[phab:T429156|T429156]]
* 18:27 swfrench-wmf: (eqiad) kubectl delete pod coredns-54cdd9bdf-6n4ps -n kube-system - [[phab:T429156|T429156]]
* 18:26 jhuneidi@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304067{{!}}SpecialSpecialPages: Guard against special pages with no content-language alias (T429584)]] (duration: 08m 46s)
* 18:21 jhuneidi@deploy1003: jhuneidi, jforrester: Continuing with deployment
* 18:19 jhuneidi@deploy1003: jhuneidi, jforrester: Backport for [[gerrit:1304067{{!}}SpecialSpecialPages: Guard against special pages with no content-language alias (T429584)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:17 jhuneidi@deploy1003: Started scap sync-world: Backport for [[gerrit:1304067{{!}}SpecialSpecialPages: Guard against special pages with no content-language alias (T429584)]]
* 18:09 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 18:04 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 17:37 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host dns7002.wikimedia.org with OS bookworm
* 16:28 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304112{{!}}Add script to fix fr_archive_name drifts (T428406)]] (duration: 06m 46s)
* 16:24 zabe@deploy1003: zabe: Continuing with deployment
* 16:24 zabe@deploy1003: zabe: Backport for [[gerrit:1304112{{!}}Add script to fix fr_archive_name drifts (T428406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:22 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1304112{{!}}Add script to fix fr_archive_name drifts (T428406)]]
* 15:55 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303981{{!}}LocalFileMoveBatch: Also update fr_archive_name when moving file (T428406)]] (duration: 06m 49s)
* 15:51 zabe@deploy1003: zabe: Continuing with deployment
* 15:51 zabe@deploy1003: zabe: Backport for [[gerrit:1303981{{!}}LocalFileMoveBatch: Also update fr_archive_name when moving file (T428406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1303981{{!}}LocalFileMoveBatch: Also update fr_archive_name when moving file (T428406)]]
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:08 elukey@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 15:08 elukey@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 15:04 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304082{{!}}Check that data-parsoid is an array before accessing it as such (T429582)]] (duration: 11m 17s)
* 15:00 cscott@deploy1003: ihurbain, cscott: Continuing with deployment
* 14:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:57 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:55 cscott@deploy1003: ihurbain, cscott: Backport for [[gerrit:1304082{{!}}Check that data-parsoid is an array before accessing it as such (T429582)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:53 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1304082{{!}}Check that data-parsoid is an array before accessing it as such (T429582)]]
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:51 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:51 ayounsi@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:46 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:42 moritzm: installing zsh updates from Bookworm point release
* 14:37 brouberol@dns1004: END - running authdns-update
* 14:35 brouberol@dns1004: START - running authdns-update
* 14:27 jgreen@dns1004: END - running authdns-update
* 14:25 jgreen@dns1004: START - running authdns-update
* 14:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy2007.codfw.wmnet
* 14:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy2007.codfw.wmnet
* 14:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy2008.codfw.wmnet
* 14:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy2008.codfw.wmnet
* 14:20 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 14:20 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 14:19 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2235.codfw.wmnet
* 14:19 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2235.codfw.wmnet
* 14:14 Msz2001: Finished deploying private code change
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2235.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2008.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 14:08 moritzm: installing unbound security updates
* 14:07 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2234.codfw.wmnet
* 14:07 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2234.codfw.wmnet
* 14:00 tgr_: UTC afternoon deploys done
* 14:00 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304038{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]], [[gerrit:1304039{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]] (duration: 11m 51s)
* 13:56 tgr@deploy1003: tgr: Continuing with deployment
* 13:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2234.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2007.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:52 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy2005.codfw.wmnet
* 13:52 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy2005.codfw.wmnet
* 13:51 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2232.codfw.wmnet
* 13:51 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2232.codfw.wmnet
* 13:51 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 13:51 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 13:50 tgr@deploy1003: tgr: Backport for [[gerrit:1304038{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]], [[gerrit:1304039{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:48 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1304038{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]], [[gerrit:1304039{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]]
* 13:46 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303613{{!}}magwiki: add wordmark, metanamespace, sitename and timezone (T428279)]], [[gerrit:1304004{{!}}stream: webrequest.page_trending.dev0 (T429588)]] (duration: 08m 15s)
* 13:42 tgr@deploy1003: javiermonton, tgr, anzx: Continuing with deployment
* 13:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 13:40 tgr@deploy1003: javiermonton, tgr, anzx: Backport for [[gerrit:1303613{{!}}magwiki: add wordmark, metanamespace, sitename and timezone (T428279)]], [[gerrit:1304004{{!}}stream: webrequest.page_trending.dev0 (T429588)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:38 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1303613{{!}}magwiki: add wordmark, metanamespace, sitename and timezone (T428279)]], [[gerrit:1304004{{!}}stream: webrequest.page_trending.dev0 (T429588)]]
* 13:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2232.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2005.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:33 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 13:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303004{{!}}REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771)]] (duration: 06m 56s)
* 13:26 ladsgroup@deploy1003: ladsgroup, bpirkle: Continuing with deployment
* 13:25 ladsgroup@deploy1003: ladsgroup, bpirkle: Backport for [[gerrit:1303004{{!}}REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303004{{!}}REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771)]]
* 13:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:21 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302923{{!}}EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380)]] (duration: 10m 55s)
* 13:14 ladsgroup@deploy1003: ladsgroup, lerickson: Continuing with deployment
* 13:10 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:10 ladsgroup@deploy1003: ladsgroup, lerickson: Backport for [[gerrit:1302923{{!}}EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:08 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:08 fabfur: deploying new haproxykafka on A:cp to parse for x_provenance ([[phab:T427068|T427068]])
* 13:08 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1302923{{!}}EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of testvm2005.codfw.wmnet to plain
* 13:05 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to plain
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
* 13:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis magwiki in section s5
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 12:56 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 12:39 fabfur: upgrade haproxykafka on cp1111 to test for new x-provenance field ([[phab:T427068|T427068]])
* 12:36 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 12:35 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis magwiki in section s5
* 12:34 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 12:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis magwiki in section s5
* 12:31 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304017{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]], [[gerrit:1304016{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]] (duration: 17m 49s)
* 12:29 cwilliams@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis magwiki in section s5
* 12:27 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:16 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1304017{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]], [[gerrit:1304016{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:14 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1304017{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]], [[gerrit:1304016{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]]
* 11:10 atsukoito: atsuko updated charlie to 0.0.19 https://w.wiki/RPKN
* 10:37 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.disable-merges (exit_code=99)
* 10:37 jmm@cumin2002: START - Cookbook sre.puppet.disable-merges
* 10:24 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303986{{!}}hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394)]] (duration: 12m 13s)
* 10:19 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 10:14 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1303986{{!}}hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1303986{{!}}hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394)]]
* 10:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 10:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 10:01 fabfur@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]"
* 10:01 fabfur@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]
* 10:00 fabfur@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]
* 10:00 fabfur@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]"
* 09:59 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303983{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]], [[gerrit:1303982{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]] (duration: 08m 10s)
* 09:55 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:54 kharlan@deploy1003: kharlan: Backport for [[gerrit:1303983{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]], [[gerrit:1303982{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:51 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1303983{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]], [[gerrit:1303982{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]]
* 09:33 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 09:33 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 09:33 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 09:32 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 09:11 moritzm: installing apache2 security updates
* 08:55 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 08:53 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 08:53 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 08:51 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 08:51 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 08:51 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 08:35 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:34 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:22 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 08:21 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 08:20 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 08:19 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 08:05 moritzm: regenerate pbuilder environments on build2001 to use deb.debian.org [[phab:T416707|T416707]]
* 08:02 moritzm: uploaded wmf-laptop 1.0.6 to component/wmf-laptop on apt.wikimedia.org
* 08:01 moritzm: regenerate pbuilder environments on build2002 to use deb.debian.org [[phab:T416707|T416707]]
* 06:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 06:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2040: Migration of es2040.codfw.wmnet completed
* 06:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2040: Migration of es2040.codfw.wmnet completed
* 05:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2040.codfw.wmnet with OS trixie
* 05:41 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
* 05:41 marostegui@cumin1003: Removing db1224 from zarcillo [[phab:T429561|T429561]]
* 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1224.eqiad.wmnet
* 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1224.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:40 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1224.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:36 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 05:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2040.codfw.wmnet with reason: host reimage
* 05:31 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2040.codfw.wmnet with reason: host reimage
* 05:31 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db1224.eqiad.wmnet
* 05:30 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 05:27 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db1224 from dbctl [[phab:T429561|T429561]]', diff saved to https://phabricator.wikimedia.org/P94269 and previous config saved to /var/cache/conftool/dbconfig/20260618-052737-marostegui.json
* 05:14 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2040.codfw.wmnet with OS trixie
* 05:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2040: Upgrading es2040.codfw.wmnet
* 05:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2040: Upgrading es2040.codfw.wmnet
* 05:12 marostegui@cumin1003: dbmaint on es7@codfw [[phab:T429463|T429463]]
* 05:12 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 45s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303600{{!}}Update interwiki map (T428266)]] (duration: 06m 55s)
* 01:15 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 01:14 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1303600{{!}}Update interwiki map (T428266)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:12 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303600{{!}}Update interwiki map (T428266)]]
* 00:48 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303596{{!}}Activate magwiki (T428266)]] (duration: 07m 25s)
* 00:43 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:42 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1303596{{!}}Activate magwiki (T428266)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:40 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303596{{!}}Activate magwiki (T428266)]]
* 00:33 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303594{{!}}Init magwiki (T428266)]] (duration: 07m 14s)
* 00:29 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:28 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1303594{{!}}Init magwiki (T428266)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303594{{!}}Init magwiki (T428266)]]
== 2026-06-17 ==
* 23:26 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303504{{!}}Enable beta mobile MMV on Wikipedias (T426775)]] (duration: 06m 46s)
* 23:22 egardner@deploy1003: egardner: Continuing with deployment
* 23:21 egardner@deploy1003: egardner: Backport for [[gerrit:1303504{{!}}Enable beta mobile MMV on Wikipedias (T426775)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:19 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1303504{{!}}Enable beta mobile MMV on Wikipedias (T426775)]]
* 23:17 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303552{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303553{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303554{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] (duration: 06m 55s)
* 23:14 mutante: gerrit2002 - unlink /srv/gerrit/site_path/review_site/logs/logs ([[phab:T425667|T425667]])
* 23:12 egardner@deploy1003: egardner: Continuing with deployment
* 23:12 egardner@deploy1003: egardner: Backport for [[gerrit:1303552{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303553{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303554{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:10 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1303552{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303553{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303554{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]]
* 23:04 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303571{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303572{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303573{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] (duration: 12m 31s)
* 22:57 egardner@deploy1003: egardner: Continuing with deployment
* 22:56 egardner@deploy1003: egardner: Backport for [[gerrit:1303571{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303572{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303573{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:52 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1303571{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303572{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303573{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]]
* 22:45 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303517{{!}}Donor Delight Badge: Add accessible label and hide popover from AT (T427313)]] (duration: 31m 01s)
* 22:32 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:31 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1303517{{!}}Donor Delight Badge: Add accessible label and hide popover from AT (T427313)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:14 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1303517{{!}}Donor Delight Badge: Add accessible label and hide popover from AT (T427313)]]
* 21:52 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:52 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:29 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:29 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:29 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:28 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:27 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:27 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:23 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:22 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:22 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:21 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:20 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:20 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:15 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:12 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:12 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:09 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:06 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:05 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:02 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 21:02 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 20:45 cdobbins@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dns7002.wikimedia.org with reason: bird.service keeps failing
* 20:41 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-ats (exit_code=0) rolling restart_daemons on A:cp
* 20:41 cdobbins@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns7002.wikimedia.org with OS trixie
* 20:36 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303012{{!}}Enable ULS v2 on group1 wikis]] (duration: 08m 26s)
* 20:31 sbisson@deploy1003: sbisson, abi: Continuing with deployment
* 20:29 sbisson@deploy1003: sbisson, abi: Backport for [[gerrit:1303012{{!}}Enable ULS v2 on group1 wikis]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:27 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1303012{{!}}Enable ULS v2 on group1 wikis]]
* 20:17 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303365{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]], [[gerrit:1303364{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]] (duration: 06m 55s)
* 20:13 sgimeno@deploy1003: sgimeno: Continuing with deployment
* 20:12 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1303365{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]], [[gerrit:1303364{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:11 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1303365{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]], [[gerrit:1303364{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]]
* 19:44 jgreen@dns1005: END - running authdns-update
* 19:42 jgreen@dns1005: START - running authdns-update
* 19:31 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5005*<nowiki>}</nowiki> and A:liberica ([[phab:T428229|T428229]])
* 19:30 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5005*<nowiki>}</nowiki> and A:liberica ([[phab:T428229|T428229]])
* 19:16 jhuneidi@deploy1003: Finished scap sync-world: wmf.7 to group 1 (Take 2) (duration: 07m 01s)
* 19:16 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-purged (exit_code=0) rolling restart_daemons on A:cp and not P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:10 jhuneidi@deploy1003: Started scap sync-world: wmf.7 to group 1 (Take 2)
* 19:08 jhuneidi@deploy1003: Finished scap sync-world: Attempt to roll wmf.7 to group 1 (duration: 07m 24s)
* 19:01 jhuneidi@deploy1003: Started scap sync-world: Attempt to roll wmf.7 to group 1
* 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcontrol1008-dev.eqiad.wmnet
* 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol1008-dev.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 18:59 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol1008-dev.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 18:52 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 18:46 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudcontrol1008-dev.eqiad.wmnet
* 18:24 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6011.*
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6011.drmrs.wmnet
* 18:24 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp6011.drmrs.wmnet
* 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp6011.drmrs.wmnet with reason: ats restart, continuing from failed cookbook run
* 18:17 brett: commit new lvs5005 IP address to cr2-eqsin.wikimedia.org,cr3-eqsin.wikimedia.org
* 18:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp6011.drmrs.wmnet
* 18:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp6011.drmrs.wmnet
* 18:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6011.*
* 17:41 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs5005.eqsin.wmnet with OS bookworm
* 17:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs5005.eqsin.wmnet with reason: host reimage
* 17:16 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs5005.eqsin.wmnet with reason: host reimage
* 17:06 mutante: contint1003 - even with gerrit:1301416 jenkins was STILL restarted :/ - stopping it manually and puppet - debugging - [[phab:T418521|T418521]]
* 17:03 mutante: contint1003 - re-enabling puppet - checking it does NOT start jenkins - also see gerrit:1297236 and gerrit:1301416 - [[phab:T418521|T418521]]
* 16:51 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:51 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:49 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-ats rolling restart_daemons on A:cp
* 16:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host lvs5005
* 16:48 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs5005
* 16:48 dcausse@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:47 dcausse@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:47 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs5005
* 16:47 brett@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:47 brett@cumin2002: START - Cookbook sre.dns.wipe-cache lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:45 brett@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:45 brett@cumin2002: START - Cookbook sre.dns.wipe-cache lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:45 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:45 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host lvs5005 - brett@cumin2002"
* 16:45 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host lvs5005 - brett@cumin2002"
* 16:45 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:45 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:39 brett@cumin2002: START - Cookbook sre.dns.netbox
* 16:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1078.eqiad.wmnet with OS trixie
* 16:16 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:16 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host lvs5005
* 16:16 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:15 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs5005.eqsin.wmnet with OS bookworm
* 16:15 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1007.eqiad.wmnet with OS trixie
* 16:15 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:11 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:02 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 16:02 brett@cumin2002: START - Cookbook sre.loadbalancer.admin depooling P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 16:00 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-purged rolling restart_daemons on A:cp and not P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 15:58 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1078.eqiad.wmnet with reason: host reimage
* 15:54 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1007.eqiad.wmnet with reason: host reimage
* 15:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Migration of es2048.codfw.wmnet completed
* 15:53 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1078.eqiad.wmnet with reason: host reimage
* 15:47 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1007.eqiad.wmnet with reason: host reimage
* 15:46 moritzm: installing python-ldap security updates
* 15:42 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1078.eqiad.wmnet with OS trixie
* 15:30 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:27 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 15:26 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
* 15:08 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Migration of es2048.codfw.wmnet completed
* 15:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:03 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1004.eqiad.wmnet with OS trixie
* 15:02 aokoth@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab (duration: 01m 24s)
* 15:00 aokoth@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab
* 14:59 cdobbins@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 14:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2048.codfw.wmnet with OS trixie
* 14:56 cdobbins@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 14:44 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1004.eqiad.wmnet with reason: host reimage
* 14:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2048.codfw.wmnet with reason: host reimage
* 14:35 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1004.eqiad.wmnet with reason: host reimage
* 14:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2048.codfw.wmnet with reason: host reimage
* 14:28 cdobbins@cumin1003: START - Cookbook sre.hosts.reimage for host dns7002.wikimedia.org with OS trixie
* 14:26 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303436{{!}}Add Wikidata configuration for WikiProject links (T422935 T422936)]] (duration: 07m 49s)
* 14:22 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
* 14:21 cjd91: depooling dns7002 to attempt reimage to trixie
* 14:20 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1303436{{!}}Add Wikidata configuration for WikiProject links (T422935 T422936)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:19 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1004.eqiad.wmnet with OS trixie
* 14:19 cdobbins@cumin1003: conftool action : set/pooled=no; selector: name=dns7002.*
* 14:18 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1303436{{!}}Add Wikidata configuration for WikiProject links (T422935 T422936)]]
* 14:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2048.codfw.wmnet with OS trixie
* 14:17 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 14:17 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 14:17 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 14:16 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 14:16 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2048: Upgrading es2048.codfw.wmnet
* 14:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2048: Upgrading es2048.codfw.wmnet
* 14:13 elukey: add basic Kafka ACLs for anonymous to logging-eqiad - [[phab:T425528|T425528]]
* 14:13 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:13 Lucas_WMDE: UTC afternoon backport+config window done
* 14:13 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302739{{!}}ULS rewrite: Lock body scroll when open on mobile]], [[gerrit:1302743{{!}}ULS rewrite: Fix settings dialog width and field sizing (T416512)]], [[gerrit:1303010{{!}}ULS rewrite: Show variants even when no languages are available (T426532)]], [[gerrit:1303009{{!}}ULS rewrite: Capture trigger element before async module load (T429145)]], [[gerr
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs-test1001.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1003.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1002.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1001.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs-test1001.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1003.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1002.eqiad.wmnet
* 14:11 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1001.eqiad.wmnet
* 14:11 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs*.eqiad.wmnet
* 14:08 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, abi: Continuing with deployment
* 14:06 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:01 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 14:00 jmm@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2003.codfw.wmnet with OS bookworm
* 13:58 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2004.codfw.wmnet with OS bookworm
* 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 13:55 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, abi: Backport for [[gerrit:1302739{{!}}ULS rewrite: Lock body scroll when open on mobile]], [[gerrit:1302743{{!}}ULS rewrite: Fix settings dialog width and field sizing (T416512)]], [[gerrit:1303010{{!}}ULS rewrite: Show variants even when no languages are available (T426532)]], [[gerrit:1303009{{!}}ULS rewrite: Capture trigger element before async module load (T429145)]], [[ge
* 13:53 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1302739{{!}}ULS rewrite: Lock body scroll when open on mobile]], [[gerrit:1302743{{!}}ULS rewrite: Fix settings dialog width and field sizing (T416512)]], [[gerrit:1303010{{!}}ULS rewrite: Show variants even when no languages are available (T426532)]], [[gerrit:1303009{{!}}ULS rewrite: Capture trigger element before async module load (T429145)]], [[gerri
* 13:52 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 13:51 jmm@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 13:51 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.bmc-user-mgmt (exit_code=0) for host sretest[2001,2003-2004,2006,2009-2010].codfw.wmnet,sretest1005.eqiad.wmnet
* 13:50 elukey@cumin1003: START - Cookbook sre.hosts.bmc-user-mgmt for host sretest[2001,2003-2004,2006,2009-2010].codfw.wmnet,sretest1005.eqiad.wmnet
* 13:47 papaul: mgmt interface change on mr-codfw
* 13:46 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-codfw with reason: mgmt interface change
* 13:45 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-codfw with reason: switch refresh
* 13:42 jmm@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:42 jmm@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 13:33 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298293{{!}}Add Wikidata configuration for WikiProject links (T422935)]], [[gerrit:1299943{{!}}Add instance-of WikiProject links for paintings and elections (T422936)]] (duration: 08m 14s)
* 13:32 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1006.eqiad.wmnet with OS trixie
* 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudcephosd1016
* 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudcephosd1016
* 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1061
* 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1061
* 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1069
* 13:31 lucaswerkmeister-wmde@deploy1003: sadiyamohammed13, lucaswerkmeister-wmde: Rolling back deployment
* 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1069
* 13:30 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1068
* 13:30 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1068
* 13:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1005.eqiad.wmnet with OS trixie
* 13:27 lucaswerkmeister-wmde@deploy1003: sadiyamohammed13, lucaswerkmeister-wmde: Backport for [[gerrit:1298293{{!}}Add Wikidata configuration for WikiProject links (T422935)]], [[gerrit:1299943{{!}}Add instance-of WikiProject links for paintings and elections (T422936)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298293{{!}}Add Wikidata configuration for WikiProject links (T422935)]], [[gerrit:1299943{{!}}Add instance-of WikiProject links for paintings and elections (T422936)]]
* 13:24 jmm@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 13:23 jmm@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 13:14 dani@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302998{{!}}Add English Wikipedia Mobile App Survey (T428876)]] (duration: 07m 53s)
* 13:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1006.eqiad.wmnet with reason: host reimage
* 13:11 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-codfw
* 13:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1005.eqiad.wmnet with reason: host reimage
* 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-eqiad
* 13:10 dani@deploy1003: dani: Continuing with deployment
* 13:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1045: repool after upgrade
* 13:08 dani@deploy1003: dani: Backport for [[gerrit:1302998{{!}}Add English Wikipedia Mobile App Survey (T428876)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1006.eqiad.wmnet with reason: host reimage
* 13:06 dani@deploy1003: Started scap sync-world: Backport for [[gerrit:1302998{{!}}Add English Wikipedia Mobile App Survey (T428876)]]
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1005.eqiad.wmnet with reason: host reimage
* 13:00 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:53 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:52 blake@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host mc-gp1006
* 12:52 blake@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host mc-gp1006
* 12:51 blake@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc-gp1006
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mc-gp1006.eqiad.wmnet 182.48.64.10.in-addr.arpa 2.8.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:51 blake@cumin1003: START - Cookbook sre.dns.wipe-cache mc-gp1006.eqiad.wmnet 182.48.64.10.in-addr.arpa 2.8.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host mc-gp1005
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host mc-gp1005
* 12:49 blake@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc-gp1005
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mc-gp1005.eqiad.wmnet 126.32.64.10.in-addr.arpa 6.2.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:49 blake@cumin1003: START - Cookbook sre.dns.wipe-cache mc-gp1005.eqiad.wmnet 126.32.64.10.in-addr.arpa 6.2.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host mc-gp1005 - blake@cumin1003"
* 12:49 blake@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host mc-gp1005 - blake@cumin1003"
* 12:48 blake@cumin1003: START - Cookbook sre.dns.netbox
* 12:45 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-codfw
* 12:45 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-eqiad
* 12:43 blake@cumin1003: START - Cookbook sre.dns.netbox
* 12:41 blake@cumin1003: START - Cookbook sre.hosts.move-vlan for host mc-gp1006
* 12:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 12:41 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:41 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:41 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:41 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1006.eqiad.wmnet with OS trixie
* 12:41 blake@cumin1003: START - Cookbook sre.hosts.move-vlan for host mc-gp1005
* 12:40 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1005.eqiad.wmnet with OS trixie
* 12:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2004.codfw.wmnet with reason: host reimage
* 12:37 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1163: Migration of db1163.eqiad.wmnet completed
* 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2003.codfw.wmnet with reason: host reimage
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:32 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 12:32 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 12:32 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 12:32 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 12:29 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2004.codfw.wmnet with reason: host reimage
* 12:28 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2003.codfw.wmnet with reason: host reimage
* 12:24 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1045: repool after upgrade
* 12:23 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:22 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1045.eqiad.wmnet with OS trixie
* 12:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2001.codfw.wmnet with reason: host reimage
* 12:19 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:16 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2004.codfw.wmnet with OS bookworm
* 12:16 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2003.codfw.wmnet with OS bookworm
* 12:15 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2001.codfw.wmnet with reason: host reimage
* 12:13 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 12:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:07 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:07 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:07 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 12:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:05 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1045.eqiad.wmnet with reason: host reimage
* 12:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2044: repool after maintenance es2044
* 12:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 12:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 12:01 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1045.eqiad.wmnet with reason: host reimage
* 11:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 11:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 11:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 11:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 11:51 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 11:51 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1163: Migration of db1163.eqiad.wmnet completed
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1045.eqiad.wmnet with OS trixie
* 11:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1045: Upgrading es1045.eqiad.wmnet
* 11:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1045: Upgrading es1045.eqiad.wmnet
* 11:42 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1163.eqiad.wmnet with OS trixie
* 11:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 11:35 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 11:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1191.eqiad.wmnet with reason: upgrading
* 11:23 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 11:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1163.eqiad.wmnet with reason: host reimage
* 11:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1172.eqiad.wmnet with reason: upgrading
* 11:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet
* 11:21 marostegui@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1:00:00 on db1171.eqiad.wmnet with reason: upgrading
* 11:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: upgrading
* 11:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1163.eqiad.wmnet with reason: host reimage
* 11:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:17 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2044: repool after maintenance es2044
* 11:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2044.codfw.wmnet with OS trixie
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1003.eqiad.wmnet with OS bookworm
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:11 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:10 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 11:09 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:08 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1038: Migration of es1038.eqiad.wmnet completed
* 11:04 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1163.eqiad.wmnet with OS trixie
* 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:01 moritzm: The Debian mirror on mirrors.wikimedia.org has been disabled [[phab:T416707|T416707]]
* 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1163: Upgrading db1163.eqiad.wmnet
* 10:59 btullis@cumin1003: START - Cookbook sre.hosts.dhcp for host dse-k8s-wdqs2001.codfw.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1163: Upgrading db1163.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2044.codfw.wmnet with reason: host reimage
* 10:53 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2044.codfw.wmnet with reason: host reimage
* 10:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1003.eqiad.wmnet with reason: host reimage
* 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2203: Migration of db2203.codfw.wmnet completed
* 10:43 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1003.eqiad.wmnet with reason: host reimage
* 10:38 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2044.codfw.wmnet with OS trixie
* 10:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2044: Upgrading es2044.codfw.wmnet
* 10:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2044: Upgrading es2044.codfw.wmnet
* 10:35 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:35 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:35 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:34 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:34 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:34 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:31 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1003.eqiad.wmnet with OS bookworm
* 10:29 moritzm: installing git-lfs security updates
* 10:28 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 10:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1038: Migration of es1038.eqiad.wmnet completed
* 10:22 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 10:21 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 10:17 claime: cumin -x 'A:swift-fe' "enable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1038.eqiad.wmnet with OS trixie
* 10:12 claime: cumin -x 'A:swift-fe' "disable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
* 10:10 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 10:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:04 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1002.eqiad.wmnet with reason: host reimage
* 10:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2203: Migration of db2203.codfw.wmnet completed
* 10:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1002.eqiad.wmnet with reason: host reimage
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1038.eqiad.wmnet with reason: host reimage
* 09:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1038.eqiad.wmnet with reason: host reimage
* 09:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2203.codfw.wmnet with OS trixie
* 09:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2045: repool after maintenance es2045
* 09:48 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 09:47 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303356{{!}}hCaptcha: Remove config for VE and DT enable (T428883)]], [[gerrit:1303354{{!}}Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883)]] (duration: 15m 32s)
* 09:41 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 09:39 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 09:38 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1038.eqiad.wmnet with OS trixie
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1038: Upgrading es1038.eqiad.wmnet
* 09:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1303356{{!}}hCaptcha: Remove config for VE and DT enable (T428883)]], [[gerrit:1303354{{!}}Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1038: Upgrading es1038.eqiad.wmnet
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:37 marostegui@dns1004: END - running authdns-update
* 09:36 marostegui@cumin1003: dbctl commit (dc=all): 'Set es6 eqiad back to read-write - [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94226 and previous config saved to /var/cache/conftool/dbconfig/20260617-093559-marostegui.json
* 09:35 marostegui@dns1004: START - running authdns-update
* 09:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es1038 [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94225 and previous config saved to /var/cache/conftool/dbconfig/20260617-093513-marostegui.json
* 09:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2203.codfw.wmnet with reason: host reimage
* 09:33 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1037 to es6 primary [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94224 and previous config saved to /var/cache/conftool/dbconfig/20260617-093310-marostegui.json
* 09:32 marostegui: Starting es6 eqiad failover from es1038 to es1037 - [[phab:T429436|T429436]]
* 09:32 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1303356{{!}}hCaptcha: Remove config for VE and DT enable (T428883)]], [[gerrit:1303354{{!}}Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883)]]
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es1037 with weight 0 [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94223 and previous config saved to /var/cache/conftool/dbconfig/20260617-092940-marostegui.json
* 09:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 8 hosts with reason: Primary switchover es6 [[phab:T429436|T429436]]
* 09:29 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es6 eqiad as read-only for maintenance - [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94222 and previous config saved to /var/cache/conftool/dbconfig/20260617-092913-marostegui.json
* 09:27 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2203.codfw.wmnet with reason: host reimage
* 09:26 jynus: testing x1 backups @ cumin2003 [[phab:T427897|T427897]]
* 09:11 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2203.codfw.wmnet with OS trixie
* 09:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2203: Upgrading db2203.codfw.wmnet
* 09:09 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2203: Upgrading db2203.codfw.wmnet
* 09:09 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:07 elukey: add basic Kafka ACLs for anonymous to logging-codfw - [[phab:T425528|T425528]] (I'll add rollback steps in the task if needed)
* 09:06 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2045: repool after maintenance es2045
* 09:06 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es2044: Upgrading es2044.codfw.wmnet
* 09:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2044: Upgrading es2044.codfw.wmnet
* 09:04 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:02 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2046 to es5 codfw primary [[phab:T428572|T428572]]', diff saved to https://phabricator.wikimedia.org/P94219 and previous config saved to /var/cache/conftool/dbconfig/20260617-090221-marostegui.json
* 09:02 joal@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:01 joal@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:00 joal@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 08:59 joal@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:57 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2203 [[phab:T429190|T429190]]', diff saved to https://phabricator.wikimedia.org/P94218 and previous config saved to /var/cache/conftool/dbconfig/20260617-085615-cwilliams.json
* 08:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2009.codfw.wmnet with OS trixie
* 08:55 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:55 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:53 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2212 to s1 primary [[phab:T429190|T429190]]', diff saved to https://phabricator.wikimedia.org/P94217 and previous config saved to /var/cache/conftool/dbconfig/20260617-085310-cwilliams.json
* 08:51 cezmunsta: Starting s1 codfw failover from db2203 to db2212 - [[phab:T429190|T429190]]
* 08:51 marostegui@dns1004: END - running authdns-update
* 08:49 marostegui@dns1004: START - running authdns-update
* 08:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:46 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2212 with weight 0 [[phab:T429190|T429190]]', diff saved to https://phabricator.wikimedia.org/P94215 and previous config saved to /var/cache/conftool/dbconfig/20260617-084642-cwilliams.json
* 08:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:46 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T429190|T429190]]
* 08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1044: repool after upgrade
* 08:38 jelto: "Imported helm3 3.19.5-1 to bullseye-wikimedia, bookworm-wikimedia and trixie-wikimedia - [[phab:T427403|T427403]]"
* 08:38 moritzm: installing apache2 security updates
* 08:36 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:35 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2009.codfw.wmnet with reason: host reimage
* 08:31 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2009.codfw.wmnet with reason: host reimage
* 08:25 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303296{{!}}Squashed diff to master]], [[gerrit:1303295{{!}}Squashed diff to master]] (duration: 35m 34s)
* 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2008.codfw.wmnet with OS trixie
* 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:17 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:14 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2009.codfw.wmnet with OS trixie
* 08:12 mlitn@deploy1003: mlitn: Continuing with deployment
* 08:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 08:09 mlitn@deploy1003: mlitn: Backport for [[gerrit:1303296{{!}}Squashed diff to master]], [[gerrit:1303295{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:07 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 08:04 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2008.codfw.wmnet with reason: host reimage
* 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2007.codfw.wmnet with OS trixie
* 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:03 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1001.eqiad.wmnet with OS bookworm
* 08:01 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 08:00 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1044: repool after upgrade
* 08:00 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2008.codfw.wmnet with reason: host reimage
* 07:59 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1044.eqiad.wmnet with OS trixie
* 07:53 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:49 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1303296{{!}}Squashed diff to master]], [[gerrit:1303295{{!}}Squashed diff to master]]
* 07:44 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2007.codfw.wmnet with reason: host reimage
* 07:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 07:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 07:42 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2008.codfw.wmnet with OS trixie
* 07:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1044.eqiad.wmnet with reason: host reimage
* 07:39 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2007.codfw.wmnet with reason: host reimage
* 07:32 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1044.eqiad.wmnet with reason: host reimage
* 07:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:23 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main'.
* 07:23 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2007.codfw.wmnet with OS trixie
* 07:22 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main'.
* 07:22 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003"
* 07:22 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003
* 07:21 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main'.
* 07:21 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003
* 07:21 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003"
* 07:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1044.eqiad.wmnet with OS trixie
* 07:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1044: Upgrading es1044.eqiad.wmnet
* 07:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1044: Upgrading es1044.eqiad.wmnet
* 07:15 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1037: Migration of es1037.eqiad.wmnet completed
* 06:53 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 06:53 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 06:52 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 06:52 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 06:46 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 06:46 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 06:46 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 06:46 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 06:28 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1037: Migration of es1037.eqiad.wmnet completed
* 06:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1037.eqiad.wmnet with OS trixie
* 05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1037.eqiad.wmnet with reason: host reimage
* 05:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1037.eqiad.wmnet with reason: host reimage
* 05:38 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1037.eqiad.wmnet with OS trixie
* 05:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1037: Upgrading es1037.eqiad.wmnet
* 05:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1037: Upgrading es1037.eqiad.wmnet
* 05:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 00:01 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
* 00:01 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
== 2026-06-16 ==
* 23:44 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2006.codfw.wmnet with reason: host reimage
* 23:38 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2006.codfw.wmnet with reason: host reimage
* 23:03 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 23:02 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp - OpenSSL update ()
* 23:01 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
* 22:57 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
* 22:57 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
* 22:52 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
* 22:50 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
* 22:50 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:49 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:37 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
* 22:30 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 22:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:08 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:07 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302953{{!}}Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355)]], [[gerrit:1302952{{!}}Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355)]] (duration: 08m 11s)
* 22:02 kemayo@deploy1003: kemayo: Continuing with deployment
* 22:01 kemayo@deploy1003: kemayo: Backport for [[gerrit:1302953{{!}}Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355)]], [[gerrit:1302952{{!}}Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:59 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1302953{{!}}Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355)]], [[gerrit:1302952{{!}}Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355)]]
* 21:52 ryankemper@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 21:50 ryankemper@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 21:49 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 21:48 ryankemper@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 21:48 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:46 ryankemper@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:45 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:38 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:34 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302934{{!}}Update definition of html heading to match Parsoid/core (T417530 T417531 T428677)]] (duration: 18m 41s)
* 21:32 robh@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:31 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:30 robh@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:29 cscott@deploy1003: arlolra, cscott: Continuing with deployment
* 21:26 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 21:25 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 21:24 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 21:24 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 21:23 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 21:21 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 21:20 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 21:20 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 21:17 cscott@deploy1003: arlolra, cscott: Backport for [[gerrit:1302934{{!}}Update definition of html heading to match Parsoid/core (T417530 T417531 T428677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1302934{{!}}Update definition of html heading to match Parsoid/core (T417530 T417531 T428677)]]
* 21:10 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:08 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 20:54 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2043.*
* 20:51 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302890{{!}}Guard round function with a supports query (T424596)]], [[gerrit:1302935{{!}}Add wprov parameter to home link (T429268)]] (duration: 09m 28s)
* 20:47 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:43 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1302890{{!}}Guard round function with a supports query (T424596)]], [[gerrit:1302935{{!}}Add wprov parameter to home link (T429268)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:41 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1302890{{!}}Guard round function with a supports query (T424596)]], [[gerrit:1302935{{!}}Add wprov parameter to home link (T429268)]]
* 20:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns5004.*
* 20:33 brett@dns1004: END - running authdns-update
* 20:31 brett@dns1004: START - running authdns-update
* 20:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns5004.wikimedia.org with OS bookworm
* 20:30 brett@dns5004: FAIL - running authdns-update
* 20:29 brett@dns5004: START - running authdns-update
* 20:28 brett@dns5004: FAIL - running authdns-update
* 20:27 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302320{{!}}EditChecks: Namespace tracking object for seen/shown/used checks]] (duration: 09m 50s)
* 20:26 brett@dns5004: START - running authdns-update
* 20:26 brett@dns5004: START - running authdns-update
* 20:25 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns5004.*,service=authdns-update
* 20:23 kemayo@deploy1003: kemayo: Continuing with deployment
* 20:19 kemayo@deploy1003: kemayo: Backport for [[gerrit:1302320{{!}}EditChecks: Namespace tracking object for seen/shown/used checks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:18 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 20:17 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1302320{{!}}EditChecks: Namespace tracking object for seen/shown/used checks]]
* 20:09 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 20:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1001.eqiad.wmnet with reason: host reimage
* 19:56 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1001.eqiad.wmnet with reason: host reimage
* 19:55 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 19:55 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:54 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:47 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:46 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 19:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1001.eqiad.wmnet with OS bookworm
* 19:39 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:35 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp - OpenSSL update ()
* 19:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:31 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 19:30 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp - OpenSSL update ()
* 19:27 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:18 topranks: restarting grpc server on eqiad SR-Linux switches to recover from problem of no free threads [[phab:T429242|T429242]]
* 19:08 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:08 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:00 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302274{{!}}Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188)]] (duration: 11m 18s)
* 18:58 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:56 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:56 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:55 krinkle@deploy1003: krinkle: Continuing with deployment
* 18:52 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:51 krinkle@deploy1003: krinkle: Backport for [[gerrit:1302274{{!}}Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:48 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1302274{{!}}Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188)]]
* 18:45 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns5004.wikimedia.org with reason: host reimage
* 18:41 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
* 18:41 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/data-gateway: apply
* 18:41 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns5004.wikimedia.org with reason: host reimage
* 18:40 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
* 18:39 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
* 18:39 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 18:39 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 18:35 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 18:34 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 18:33 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 18:30 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 18:23 jhuneidi@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.7 refs [[phab:T423916|T423916]]
* 18:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:12 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host dns5004
* 18:12 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dns5004
* 18:08 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dns5004
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dns5004.wikimedia.org 8.166.102.103.in-addr.arpa 8.0.0.0.6.6.1.0.2.0.1.0.3.0.1.0.1.0.0.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 18:08 brett@cumin2002: START - Cookbook sre.dns.wipe-cache dns5004.wikimedia.org 8.166.102.103.in-addr.arpa 8.0.0.0.6.6.1.0.2.0.1.0.3.0.1.0.1.0.0.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host dns5004 - brett@cumin2002"
* 18:08 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host dns5004 - brett@cumin2002"
* 18:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:00 brett@cumin2002: START - Cookbook sre.dns.netbox
* 18:00 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:59 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:53 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=dns5004.*
* 17:47 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:47 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change mgmt name for frproto1001 - cmooney@cumin1003"
* 17:46 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host dns5004
* 17:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host dns5004.wikimedia.org with OS bookworm
* 17:44 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change mgmt name for frproto1001 - cmooney@cumin1003"
* 17:43 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host conf2007.codfw.wmnet with OS trixie
* 17:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302912{{!}}Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it"]], [[gerrit:1302909{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]], [[gerrit:1302908{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]] (duration: 32m 19s)
* 17:38 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 17:30 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:29 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302912{{!}}Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it"]], [[gerrit:1302909{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]], [[gerrit:1302908{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:27 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2007.codfw.wmnet with OS trixie
* 17:25 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1007.eqiad.wmnet with OS trixie
* 17:20 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
* 17:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302912{{!}}Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it"]], [[gerrit:1302909{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]], [[gerrit:1302908{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]]
* 16:35 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:09 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab1004 - [[phab:T429350|T429350]] (duration: 00m 45s)
* 16:08 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab1004 - [[phab:T429350|T429350]]
* 16:08 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy
* 16:08 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab2002 - [[phab:T429350|T429350]] (duration: 00m 47s)
* 16:07 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab2002 - [[phab:T429350|T429350]]
* 16:06 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy
* 16:04 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 15:42 urbanecm@deploy1003: mwscript-k8s job started: GrowthExperiments:migrateMentorStatusAway --wiki=abwiki --dry-run # [[phab:T409170|T409170]]
* 15:39 moritzm: installing Tomcat security updates
* 15:38 urbanecm: Remove `migrateMentorStatusAwayToCommunityConfiguration` from `updatelog` on all wikis in `growthexperiments.dblist` ([[phab:T409170|T409170]])
* 15:38 dancy@deploy1003: Installation of scap version "4.269.0" completed for 2 hosts
* 15:36 dancy@deploy1003: Installing scap version "4.269.0" for 2 host(s)
* 15:33 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: test deploy phab2003 - [[phab:T427286|T427286]] (duration: 00m 49s)
* 15:33 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: test deploy phab2003 - [[phab:T427286|T427286]]
* 15:16 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 15:16 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 15:07 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-welcome # [[phab:T429352|T429352]]
* 15:06 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-discovery # [[phab:T429352|T429352]]
* 15:03 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-mentorship # [[phab:T429352|T429352]]
* 15:01 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302804{{!}}Hotfix for T428620 (T428620)]] (duration: 10m 00s)
* 14:57 awight@deploy1003: seanleong-wmde, awight: Continuing with deployment
* 14:55 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-help-panel # [[phab:T429352|T429352]]
* 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for frproto1001 (formerly payments1008) - cmooney@cumin1003"
* 14:54 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for frproto1001 (formerly payments1008) - cmooney@cumin1003"
* 14:53 awight@deploy1003: seanleong-wmde, awight: Backport for [[gerrit:1302804{{!}}Hotfix for T428620 (T428620)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:51 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1302804{{!}}Hotfix for T428620 (T428620)]]
* 14:48 aokoth@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab (duration: 02m 09s)
* 14:46 aokoth@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab
* 14:28 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 14:28 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 14:07 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302792{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187)]], [[gerrit:1302793{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T429187)]] (duration: 11m 29s)
* 14:07 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:03 dcausse@deploy1003: jgiannelos, dcausse: Continuing with deployment
* 14:02 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 14:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:58 dcausse@deploy1003: jgiannelos, dcausse: Backport for [[gerrit:1302792{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187)]], [[gerrit:1302793{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T429187)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:56 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1302792{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187)]], [[gerrit:1302793{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T429187)]]
* 13:54 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:52 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:52 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:52 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:51 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:48 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302850{{!}}Revert "translate: remove CirrusSearch endpoints"]] (duration: 04m 10s)
* 13:47 atsuko@deploy1003: atsuko: Rolling back deployment
* 13:47 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:46 atsuko@deploy1003: atsuko: Backport for [[gerrit:1302850{{!}}Revert "translate: remove CirrusSearch endpoints"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:44 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1302850{{!}}Revert "translate: remove CirrusSearch endpoints"]]
* 13:44 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:43 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:43 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:43 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:41 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:40 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 13:40 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 13:39 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:39 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302197{{!}}translate: remove CirrusSearch endpoints (T425377)]] (duration: 11m 16s)
* 13:37 atsuko@deploy1003: atsuko: Rolling back deployment
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1080.eqiad.wmnet with OS trixie
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:36 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:34 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 13:32 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1079.eqiad.wmnet with OS trixie
* 13:32 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:30 atsuko@deploy1003: atsuko: Backport for [[gerrit:1302197{{!}}translate: remove CirrusSearch endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1302197{{!}}translate: remove CirrusSearch endpoints (T425377)]]
* 13:25 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299626{{!}}Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206)]] (duration: 08m 50s)
* 13:25 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:21 dcausse@deploy1003: dcausse, neriah: Continuing with deployment
* 13:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:20 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1080.eqiad.wmnet with reason: host reimage
* 13:18 dcausse@deploy1003: dcausse, neriah: Backport for [[gerrit:1299626{{!}}Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:16 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1299626{{!}}Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206)]]
* 13:15 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:12 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1080.eqiad.wmnet with reason: host reimage
* 13:12 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298875{{!}}Remove custom streams (T423148)]] (duration: 08m 35s)
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1079.eqiad.wmnet with reason: host reimage
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1008.eqiad.wmnet with OS trixie
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:07 jmm@dns1004: END - running authdns-update
* 13:06 mfossati@deploy1003: ksarabia, mfossati: Continuing with deployment
* 13:05 mfossati@deploy1003: ksarabia, mfossati: Backport for [[gerrit:1298875{{!}}Remove custom streams (T423148)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 jmm@dns1004: START - running authdns-update
* 13:03 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1298875{{!}}Remove custom streams (T423148)]]
* 13:02 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1079.eqiad.wmnet with reason: host reimage
* 13:02 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:02 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1080.eqiad.wmnet with OS trixie
* 12:57 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:52 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1079.eqiad.wmnet with OS trixie
* 12:52 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 12:51 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1007.eqiad.wmnet with OS trixie
* 12:50 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 12:50 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver2002.codfw.wmnet
* 12:48 cmooney@cumin1003: START - Cookbook sre.mysql.pool pool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2255.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2255.codfw.wmnet
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2254.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2254.codfw.wmnet
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2243.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2243.codfw.wmnet
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2242.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2242.codfw.wmnet
* 12:47 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2092.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2092.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2091.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2091.codfw.wmnet
* 12:46 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 29 hosts
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2078.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2078.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2077.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2077.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2076.codfw.wmnet
* 12:46 cmooney@cumin1003: START - Cookbook sre.hosts.remove-downtime for 29 hosts
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2076.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2075.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2075.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2074.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2074.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2051.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2051.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2044.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2044.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2041.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2041.codfw.wmnet
* 12:46 cmooney@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2001.codfw.wmnet
* 12:46 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:45 cmooney@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2001.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2018.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2018.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2017.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2017.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2014.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2014.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2013.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2013.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2012.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2012.codfw.wmnet
* 12:44 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1008.eqiad.wmnet with reason: host reimage
* 12:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver2002.codfw.wmnet
* 12:40 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1008.eqiad.wmnet with reason: host reimage
* 12:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1006.eqiad.wmnet with reason: host reimage
* 12:28 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:24 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1008.eqiad.wmnet with OS trixie
* 12:24 topranks: reboot lsw1-a5-codfw to complete JunOS upgrade [[phab:T428020|T428020]]
* 12:23 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
* 12:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1006.eqiad.wmnet with reason: host reimage
* 12:19 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2255.codfw.wmnet
* 12:19 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2255.codfw.wmnet
* 12:19 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2254.codfw.wmnet
* 12:18 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2254.codfw.wmnet
* 12:17 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2243.codfw.wmnet
* 12:17 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2243.codfw.wmnet
* 12:17 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2242.codfw.wmnet
* 12:16 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2242.codfw.wmnet
* 12:16 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2092.codfw.wmnet
* 12:16 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2092.codfw.wmnet
* 12:16 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2091.codfw.wmnet
* 12:15 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2091.codfw.wmnet
* 12:15 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2078.codfw.wmnet
* 12:14 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2078.codfw.wmnet
* 12:14 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2077.codfw.wmnet
* 12:14 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2077.codfw.wmnet
* 12:14 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2076.codfw.wmnet
* 12:13 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2076.codfw.wmnet
* 12:13 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2075.codfw.wmnet
* 12:12 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2075.codfw.wmnet
* 12:12 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2074.codfw.wmnet
* 12:12 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2074.codfw.wmnet
* 12:12 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2051.codfw.wmnet
* 12:10 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 29 hosts with reason: lsw1-a5-codfw JunOS upgrade
* 12:07 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2051.codfw.wmnet
* 12:06 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lsw1-a5-codfw,lsw1-a5-codfw IPv6,lsw1-a5-codfw.mgmt,ssw1-a[1,8]-codfw.mgmt with reason: switch upgrade
* 12:06 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2044.codfw.wmnet
* 12:06 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2044.codfw.wmnet
* 12:06 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2041.codfw.wmnet
* 12:05 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2041.codfw.wmnet
* 12:05 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2018.codfw.wmnet
* 12:05 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2018.codfw.wmnet
* 12:04 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2017.codfw.wmnet
* 12:04 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2017.codfw.wmnet
* 12:04 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2014.codfw.wmnet
* 12:03 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2014.codfw.wmnet
* 12:03 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2013.codfw.wmnet
* 12:03 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2013.codfw.wmnet
* 12:02 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2012.codfw.wmnet
* 12:02 cmooney@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2001.codfw.wmnet
* 12:01 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2012.codfw.wmnet
* 12:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 11:57 cmooney@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2001.codfw.wmnet
* 11:51 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302794{{!}}Revert "hCaptcha: Enable for UploadWizard on all wikis with it"]] (duration: 08m 45s)
* 11:49 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:49 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:49 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 11:48 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:47 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:47 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 11:46 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:46 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:46 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 11:46 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:45 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302794{{!}}Revert "hCaptcha: Enable for UploadWizard on all wikis with it"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:43 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 11:43 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302794{{!}}Revert "hCaptcha: Enable for UploadWizard on all wikis with it"]]
* 11:42 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 11:41 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2035: Migration of es2035.codfw.wmnet completed
* 11:06 moritzm: installing Bird security updates on routed Ganeti nodes
* 10:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es1037 [[phab:T429118|T429118]]', diff saved to https://phabricator.wikimedia.org/P94172 and previous config saved to /var/cache/conftool/dbconfig/20260616-104931-marostegui.json
* 10:25 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 10:24 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 10:24 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2035: Migration of es2035.codfw.wmnet completed
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for an-redacteddb1001.eqiad.wmnet
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for an-redacteddb1001.eqiad.wmnet
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 11 hosts
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for 11 hosts
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1155.eqiad.wmnet
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1155.eqiad.wmnet
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1154.eqiad.wmnet
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1154.eqiad.wmnet
* 10:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Migration of es1036.eqiad.wmnet completed
* 10:22 jmm@dns1004: END - running authdns-update
* 10:22 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:20 jmm@dns1004: START - running authdns-update
* 10:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:19 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 10:18 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 10:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:18 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 10:18 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 10:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2035.codfw.wmnet with OS trixie
* 09:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2035.codfw.wmnet with reason: host reimage
* 09:52 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2035.codfw.wmnet with reason: host reimage
* 09:49 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 09:48 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 09:47 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302762{{!}}hCaptcha: Enable for UploadWizard on all wikis with it (T426126)]] (duration: 09m 38s)
* 09:43 marostegui: Drop wrongly created tables on testwikidatawiki s3 master [[phab:T429304|T429304]]
* 09:42 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 09:39 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302762{{!}}hCaptcha: Enable for UploadWizard on all wikis with it (T426126)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:38 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --wiki=wikidatawiki --registeredWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # [[phab:T418115|T418115]]
* 09:37 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302762{{!}}hCaptcha: Enable for UploadWizard on all wikis with it (T426126)]]
* 09:37 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --wiki=wikidatawiki --registeredWithin=1year --editedWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # [[phab:T418115|T418115]]
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Migration of es1036.eqiad.wmnet completed
* 09:37 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --registeredWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # [[phab:T418115|T418115]]
* 09:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2035.codfw.wmnet with OS trixie
* 09:34 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2035: Upgrading es2035.codfw.wmnet
* 09:34 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2035: Upgrading es2035.codfw.wmnet
* 09:34 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:32 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2035 [[phab:T429303|T429303]]', diff saved to https://phabricator.wikimedia.org/P94164 and previous config saved to /var/cache/conftool/dbconfig/20260616-093247-marostegui.json
* 09:31 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2037 to es6 primary [[phab:T429303|T429303]]', diff saved to https://phabricator.wikimedia.org/P94163 and previous config saved to /var/cache/conftool/dbconfig/20260616-093149-marostegui.json
* 09:31 jayme: imported istioctl 1.29.4-1 to bookworm-/trixie-wikimedia - [[phab:T427401|T427401]]
* 09:30 marostegui: Starting es6 codfw failover from es2035 to es2037 - [[phab:T429303|T429303]]
* 09:30 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 09:30 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 09:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es2037 with weight 0 [[phab:T429303|T429303]]', diff saved to https://phabricator.wikimedia.org/P94162 and previous config saved to /var/cache/conftool/dbconfig/20260616-092937-marostegui.json
* 09:29 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 8 hosts with reason: Primary switchover es6 [[phab:T429303|T429303]]
* 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1036.eqiad.wmnet with OS trixie
* 09:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:19 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297161{{!}}[Growth] wikidatawiki: Enable Growth features (T418115)]] (duration: 16m 29s)
* 09:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:14 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 09:13 urbanecm: php multiversion/MWScript.php WikimediaMaintenance:createExtensionTables.php --wiki=<nowiki>{</nowiki>testwikidatawiki,wikidatawiki<nowiki>}</nowiki> growthexperiments # [[phab:T418115|T418115]], within mw-debug
* 09:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage
* 09:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 09:07 tappof@cumin1003: END (PASS) - Cookbook sre.metamonitoring.downtime (exit_code=0) Downtime for 0:05:00 of prometheus/deadmanswitchnotified, prometheus/deadmanswitchonamdb, prometheus/extmon on 2 host(s) with reason: cookbook test
* 09:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 09:07 tappof@cumin1003: START - Cookbook sre.metamonitoring.downtime Downtime for 0:05:00 of prometheus/deadmanswitchnotified, prometheus/deadmanswitchonamdb, prometheus/extmon on 2 host(s) with reason: cookbook test
* 09:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 09:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 09:04 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1297161{{!}}[Growth] wikidatawiki: Enable Growth features (T418115)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage
* 09:02 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1297161{{!}}[Growth] wikidatawiki: Enable Growth features (T418115)]]
* 09:01 moritzm: uploaded bird 2.18.2-1~wmf13u1 to trixie-wikimedia [[phab:T429285|T429285]]
* 09:00 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist wikidata WikimediaMaintenance:createExtensionTables.php GrowthExperiments # [[phab:T418115|T418115]]
* 08:56 moritzm: uploaded bird 2.18.2-1~wmf12u1 to bookworm-wikimedia [[phab:T429285|T429285]]
* 08:48 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1036.eqiad.wmnet with OS trixie
* 08:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1036: Upgrading es1036.eqiad.wmnet
* 08:46 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302735{{!}}hCaptcha: Enable for MobileFrontend in all wikis (T425940)]] (duration: 19m 23s)
* 08:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1036: Upgrading es1036.eqiad.wmnet
* 08:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1047: repool after upgrade
* 08:42 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:32 moritzm: installing nginx security updates
* 08:29 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302735{{!}}hCaptcha: Enable for MobileFrontend in all wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:27 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302735{{!}}hCaptcha: Enable for MobileFrontend in all wikis (T425940)]]
* 08:23 mszwarc@deploy1003: Synchronized private/PrivateSettings.php: Private code deployment for Suggested Investigations (duration: 02m 23s)
* 08:19 mszwarc@deploy1003: Synchronized private/SuggestedInvestigationsSignals: Private code deployment for Suggested Investigations (duration: 06m 03s)
* 08:17 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94157)
* 08:05 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302629{{!}}Improve click intent event logging and exposure tracking]] (duration: 11m 31s)
* 08:00 moritzm: update bird on ganeti7001 to 2.18.2-1~wmf12u1
* 07:58 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
* 07:58 wmde-fisch@deploy1003: wmde-fisch: Backport for [[gerrit:1302629{{!}}Improve click intent event logging and exposure tracking]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1047: repool after upgrade
* 07:54 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1302629{{!}}Improve click intent event logging and exposure tracking]]
* 07:50 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302170{{!}}Update VE core submodule to master (3e79e9934) (T397319 T428764)]] (duration: 36m 13s)
* 07:36 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
* 07:33 wmde-fisch@deploy1003: wmde-fisch: Backport for [[gerrit:1302170{{!}}Update VE core submodule to master (3e79e9934) (T397319 T428764)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:14 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1302170{{!}}Update VE core submodule to master (3e79e9934) (T397319 T428764)]]
* 07:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1047.eqiad.wmnet with OS trixie
* 06:50 hashar@deploy1003: Finished deploy [integration/docroot@2165507]: build: Updating js-yaml to 4.2.0 (duration: 00m 16s)
* 06:50 hashar@deploy1003: Started deploy [integration/docroot@2165507]: build: Updating js-yaml to 4.2.0
* 06:44 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1047.eqiad.wmnet with reason: host reimage
* 06:40 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1047.eqiad.wmnet with reason: host reimage
* 06:25 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1047.eqiad.wmnet with OS trixie
* 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:24 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1047: Upgrading es1047.eqiad.wmnet
* 05:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1047: Upgrading es1047.eqiad.wmnet
* 05:58 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 04:55 ryankemper: [[phab:T427951|T427951]] Deleted 4 leftover mirrored dev/test topics from kafka-test: `eqiad.mediawiki.<nowiki>{</nowiki>page_html_content_change.dev<nowiki>{</nowiki>1,4<nowiki>}</nowiki>,page_edit_type_simple.dev0<nowiki>}</nowiki>`, `eqiad.mw_page_edit_type_enrich.error`
* 04:05 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.4 (duration: 05m 29s)
== 2026-06-15 ==
* 22:35 sbassett: Deployed private config for [[phab:T429244|T429244]]
* 22:05 sbassett: Deployed updated security fix for [[phab:T427611|T427611]]
* 22:04 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 22:04 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 22:04 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 22:03 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 21:54 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302277{{!}}beta: Point remaining db11 references at deployment-db15 (T428930)]] (duration: 12m 27s)
* 21:53 dancy@deploy1003: dancy: Continuing with deployment
* 21:49 dancy@deploy1003: dancy: Backport for [[gerrit:1302277{{!}}beta: Point remaining db11 references at deployment-db15 (T428930)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:48 sbassett: Deployed security fix for [[phab:T428809|T428809]]
* 21:48 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1302277{{!}}beta: Point remaining db11 references at deployment-db15 (T428930)]]
* 21:40 sbassett: Deployed security fix for [[phab:T428820|T428820]]
* 21:22 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302267{{!}}ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks]] (duration: 08m 11s)
* 21:17 sbassett@deploy1003: sbassett: Continuing with deployment
* 21:15 sbassett@deploy1003: sbassett: Backport for [[gerrit:1302267{{!}}ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:13 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1302267{{!}}ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks]]
* 21:06 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5028.*
* 21:06 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 21:05 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 20:52 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5028.eqsin.wmnet with OS trixie
* 20:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 20:21 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300245{{!}}REST: set new RestModuleOverrides variable (T422756)]], [[gerrit:1302232{{!}}Enable "exit the editor" survey on 11 wikis for phase 2 (T426132)]] (duration: 10m 54s)
* 20:17 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 20:16 dancy@deploy1003: caro, dancy, bpirkle: Continuing with deployment
* 20:14 dancy@deploy1003: caro, dancy, bpirkle: Backport for [[gerrit:1300245{{!}}REST: set new RestModuleOverrides variable (T422756)]], [[gerrit:1302232{{!}}Enable "exit the editor" survey on 11 wikis for phase 2 (T426132)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1300245{{!}}REST: set new RestModuleOverrides variable (T422756)]], [[gerrit:1302232{{!}}Enable "exit the editor" survey on 11 wikis for phase 2 (T426132)]]
* 20:02 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS trixie
* 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS trixie
* 19:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5028
* 19:44 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5028
* 19:43 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5028
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5028.eqsin.wmnet 25.0.132.10.in-addr.arpa 5.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:43 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5028.eqsin.wmnet 25.0.132.10.in-addr.arpa 5.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5028 - brett@cumin2002"
* 19:42 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5028 - brett@cumin2002"
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:36 brett@cumin2002: START - Cookbook sre.dns.netbox
* 19:35 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3067.esams.wmnet
* 19:34 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3067.esams.wmnet
* 19:33 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 19:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3066.esams.wmnet
* 19:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:33 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3066.esams.wmnet
* 19:26 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5028
* 19:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 19:23 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart A:liberica-eqsin
* 19:21 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart A:liberica-eqsin
* 19:18 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 19:17 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:16 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:15 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:14 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:06 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
* 19:05 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 19:05 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:04 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5026.eqsin.wmnet with OS trixie
* 18:44 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-purged (exit_code=0) rolling restart_daemons on P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:42 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-purged rolling restart_daemons on P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 18:27 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:27 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:27 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 18:18 mutante: releases2003 - systemctl stop tmp.mount
* 17:53 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5026
* 17:53 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5026
* 17:52 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5026
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5026.eqsin.wmnet 37.0.132.10.in-addr.arpa 7.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 17:52 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5026.eqsin.wmnet 37.0.132.10.in-addr.arpa 7.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5026 - brett@cumin2002"
* 17:52 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5026 - brett@cumin2002"
* 17:46 brett@cumin2002: START - Cookbook sre.dns.netbox
* 17:40 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-d8-eqiad
* 17:40 cmooney@cumin1003: START - Cookbook sre.network.tls for network device ssw1-d8-eqiad
* 17:36 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-c4-eqiad
* 17:35 cmooney@cumin1003: START - Cookbook sre.network.tls for network device lsw1-c4-eqiad
* 17:34 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-c4-eqiad
* 17:34 cmooney@cumin1003: START - Cookbook sre.network.tls for network device lsw1-c4-eqiad
* 17:09 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5026
* 17:07 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5026.eqsin.wmnet with OS trixie
* 17:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:36 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 16:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 16:16 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
* 16:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:16 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/toolhub: apply
* {{safesubst:SAL entry|1=16:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302192{{!}}SourceEditorOverlayHookPayload: Allow aborting of the save (T428287)]], [[gerrit:1302194{{!}}hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287)]], [[gerrit:1302195{{!}}OATHUserRepository: Specify caller in query]], [[gerrit:1302186{{!}}Bump guzzlehttp/psr to version 2.11.0 (T429208)]], [[gerrit:1302169{{!}}NoReferrerLinks: Add re}}
* 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:08 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
* 16:08 dreamyjazz@deploy1003: reedy, dreamyjazz, kharlan: Continuing with deployment
* 16:08 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/toolhub: apply
* {{safesubst:SAL entry|1=16:07 dreamyjazz@deploy1003: reedy, dreamyjazz, kharlan: Backport for [[gerrit:1302192{{!}}SourceEditorOverlayHookPayload: Allow aborting of the save (T428287)]], [[gerrit:1302194{{!}}hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287)]], [[gerrit:1302195{{!}}OATHUserRepository: Specify caller in query]], [[gerrit:1302186{{!}}Bump guzzlehttp/psr to version 2.11.0 (T429208)]], [[gerrit:1302169{{!}}NoReferrerLinks: Add}}
* {{safesubst:SAL entry|1=16:05 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302192{{!}}SourceEditorOverlayHookPayload: Allow aborting of the save (T428287)]], [[gerrit:1302194{{!}}hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287)]], [[gerrit:1302195{{!}}OATHUserRepository: Specify caller in query]], [[gerrit:1302186{{!}}Bump guzzlehttp/psr to version 2.11.0 (T429208)]], [[gerrit:1302169{{!}}NoReferrerLinks: Add rel}}
* 16:04 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:04 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:51 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:51 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: puppet debugging
* 15:50 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: puppet debugging
* 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1196: Migration of db1196.eqiad.wmnet completed
* 15:41 mutante: added new project language 'nyn' - Bantu language spoken by the Nkore and Hema peoples of Southwestern Uganda
* 15:40 dzahn@dns1006: END - running authdns-update
* 15:36 dzahn@dns1006: START - running authdns-update
* 15:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1155.eqiad.wmnet
* 15:19 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1155.eqiad.wmnet
* 15:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1154.eqiad.wmnet
* 15:18 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1154.eqiad.wmnet
* 15:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 11 hosts
* 15:18 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for 11 hosts
* 15:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for an-redacteddb1001.eqiad.wmnet
* 15:17 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for an-redacteddb1001.eqiad.wmnet
* 15:16 topranks: repool esams following cr2-esams rpd crash
* 15:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool esams [reason: no reason specified, no task ID specified]
* 15:13 cmooney@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool esams [reason: no reason specified, no task ID specified]
* 15:02 topranks: depool esams due to cr2-esams rpd crash
* 15:02 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool esams [reason: no reason specified, no task ID specified]
* 15:01 cmooney@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool esams [reason: no reason specified, no task ID specified]
* 15:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 14:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 14:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1196: Migration of db1196.eqiad.wmnet completed
* 14:54 topranks: enable BGP graceful-shutdown sender on cr2-esams to drain traffic [[phab:T427056|T427056]]
* 14:52 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr2-esams,cr2-esams IPv6 with reason: bouncing pic0 to reconfigure port speeds
* 14:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1196.eqiad.wmnet with OS trixie
* 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1077.eqiad.wmnet with OS trixie
* 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 14:24 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2001.codfw.wmnet with reason: tesT
* 14:24 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 14:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
* 14:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
* 14:08 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 14:07 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 14:07 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudvirt1077.eqiad.wmnet with reason: host reimage
* 14:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1077.eqiad.wmnet with reason: host reimage
* 14:06 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 14:05 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 14:05 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 14:04 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 14:03 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1196.eqiad.wmnet with OS trixie
* 14:02 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 14:02 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 14:01 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 14:01 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 14:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1196: Upgrading db1196.eqiad.wmnet
* 14:00 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1196: Upgrading db1196.eqiad.wmnet
* 14:00 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:56 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1077.eqiad.wmnet with OS trixie
* 13:56 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 13:54 federico3: doing a quick restart of sanitarium hosts db1155 and db1154
* 13:53 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94145)
* 13:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1154.eqiad.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1155.eqiad.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 11 hosts with reason: Reboots [[phab:T426633|T426633]]
* 13:49 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet with reason: Reboots [[phab:T426633|T426633]]
* {{safesubst:SAL entry|1=13:43 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300835{{!}}Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742)]], [[gerrit:1302153{{!}}TaskSuggester: avoid nullable logger in setLogger call]], [[gerrit:1302100{{!}}migrateMentorStatusAway: ensure validateStrictly receives objects (T409170)]], [[gerrit:1301451{{!}}Store nowiki source in StripState::extra to support subst-nowiki (T}}
* 13:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 13:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 13:39 jforrester@deploy1003: arlolra, sgimeno, jforrester: Continuing with deployment
* {{safesubst:SAL entry|1=13:37 jforrester@deploy1003: arlolra, sgimeno, jforrester: Backport for [[gerrit:1300835{{!}}Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742)]], [[gerrit:1302153{{!}}TaskSuggester: avoid nullable logger in setLogger call]], [[gerrit:1302100{{!}}migrateMentorStatusAway: ensure validateStrictly receives objects (T409170)]], [[gerrit:1301451{{!}}Store nowiki source in StripState::extra to support subst-nowik}}
* {{safesubst:SAL entry|1=13:35 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1300835{{!}}Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742)]], [[gerrit:1302153{{!}}TaskSuggester: avoid nullable logger in setLogger call]], [[gerrit:1302100{{!}}migrateMentorStatusAway: ensure validateStrictly receives objects (T409170)]], [[gerrit:1301451{{!}}Store nowiki source in StripState::extra to support subst-nowiki (T3}}
* 13:34 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2216: Migration of db2216.codfw.wmnet completed
* 13:29 topranks: enable BGP graceful-shutdown sender on cr2-esams to drain traffic [[phab:T427056|T427056]]
* 13:28 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr2-esams,cr2-esams IPv6 with reason: bouncing pic0 to reconfigure port speeds
* 13:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:26 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 13:25 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 13:25 topranks: cr2-esams, reconfigure chassis fpc to set port 0 to 100G [[phab:T427056|T427056]]
* 13:25 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 13:24 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1251: Migration of db1251.eqiad.wmnet completed
* {{safesubst:SAL entry|1=13:22 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293173{{!}}Configure wgOAuthAutoApprove['protocols'] (T412542 T426614)]], [[gerrit:1300873{{!}}jawiki: remove four rights from the eliminator group (T428942)]], [[gerrit:1301401{{!}}Deploy PRV to 6 wikis (T429038)]], [[gerrit:1300858{{!}}[abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730)]], [[gerrit:1300872{{!}}abstractwiki: Temporary config f}}
* 13:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:18 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:17 jforrester@deploy1003: arlolra, matmarex, jforrester, dragoniez: Continuing with deployment
* {{safesubst:SAL entry|1=13:13 jforrester@deploy1003: arlolra, matmarex, jforrester, dragoniez: Backport for [[gerrit:1293173{{!}}Configure wgOAuthAutoApprove['protocols'] (T412542 T426614)]], [[gerrit:1300873{{!}}jawiki: remove four rights from the eliminator group (T428942)]], [[gerrit:1301401{{!}}Deploy PRV to 6 wikis (T429038)]], [[gerrit:1300858{{!}}[abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730)]], [[gerrit:1300872{{!}}abstractwiki: Te}}
* 13:13 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:12 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* {{safesubst:SAL entry|1=13:12 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1293173{{!}}Configure wgOAuthAutoApprove['protocols'] (T412542 T426614)]], [[gerrit:1300873{{!}}jawiki: remove four rights from the eliminator group (T428942)]], [[gerrit:1301401{{!}}Deploy PRV to 6 wikis (T429038)]], [[gerrit:1300858{{!}}[abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730)]], [[gerrit:1300872{{!}}abstractwiki: Temporary config fo}}
* 13:10 moritzm: installing Linux 6.1.174 on Bookworm hosts
* 13:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:05 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:48 moritzm: installing augeas security updates
* 12:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2216: Migration of db2216.codfw.wmnet completed
* 12:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:43 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2036: Migration of es2036.codfw.wmnet completed
* 12:38 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302124{{!}}Extract a service that initiates SI signal matching (T428557)]], [[gerrit:1302125{{!}}Trigger Suggested Investigations when client hints are saved (T428557)]] (duration: 07m 42s)
* 12:37 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1251: Migration of db1251.eqiad.wmnet completed
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2216.codfw.wmnet with OS trixie
* 12:34 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:34 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 12:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:32 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1302124{{!}}Extract a service that initiates SI signal matching (T428557)]], [[gerrit:1302125{{!}}Trigger Suggested Investigations when client hints are saved (T428557)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:31 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1302124{{!}}Extract a service that initiates SI signal matching (T428557)]], [[gerrit:1302125{{!}}Trigger Suggested Investigations when client hints are saved (T428557)]]
* 12:27 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1251.eqiad.wmnet with OS trixie
* 12:23 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 12:21 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 12:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2216.codfw.wmnet with reason: host reimage
* 12:15 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2216.codfw.wmnet with reason: host reimage
* 12:10 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1251.eqiad.wmnet with reason: host reimage
* 12:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 12:06 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:05 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1251.eqiad.wmnet with reason: host reimage
* 11:56 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 11:55 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 11:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2036: Migration of es2036.codfw.wmnet completed
* 11:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:53 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2216.codfw.wmnet with OS trixie
* 11:50 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2216: Upgrading db2216.codfw.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2216: Upgrading db2216.codfw.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:48 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1251.eqiad.wmnet with OS trixie
* 11:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1251: Upgrading db1251.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1251: Upgrading db1251.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:44 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94128)
* 11:43 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:43 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on eqiad-k8s (dblist: https://phabricator.wikimedia.org/P94127)
* 11:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2036.codfw.wmnet with OS trixie
* 11:37 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2036.codfw.wmnet with reason: host reimage
* 11:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2036.codfw.wmnet with reason: host reimage
* 11:09 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-eqiad
* 11:08 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-eqiad
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2036.codfw.wmnet with OS trixie
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2036: Upgrading es2036.codfw.wmnet
* 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2036: Upgrading es2036.codfw.wmnet
* 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:55 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-codfw
* 10:54 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-codfw
* 10:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2037: repool after upgrade
* 10:52 moritzm: installing openssl security updates on bookworm
* 10:30 cgoubert@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301341{{!}}Close API Portal wiki (T427537)]] (duration: 07m 16s)
* 10:26 cgoubert@deploy1003: cgoubert: Continuing with deployment
* 10:25 cgoubert@deploy1003: cgoubert: Backport for [[gerrit:1301341{{!}}Close API Portal wiki (T427537)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:23 cgoubert@deploy1003: Started scap sync-world: Backport for [[gerrit:1301341{{!}}Close API Portal wiki (T427537)]]
* 10:16 blake@deploy1003: Finished scap sync-world: apache config change ([[phab:T428772|T428772]]) (duration: 06m 41s)
* 10:12 blake@deploy1003: blake: Continuing with deployment
* 10:11 blake@deploy1003: blake: apache config change ([[phab:T428772|T428772]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:10 blake@deploy1003: Started scap sync-world: apache config change ([[phab:T428772|T428772]])
* 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2037: repool after upgrade
* 10:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2037.codfw.wmnet with OS trixie
* 09:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 09:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 09:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 09:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:43 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 09:42 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 09:40 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on eqiad-k8s (dblist: https://phabricator.wikimedia.org/P94120)
* 09:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2037.codfw.wmnet with reason: host reimage
* 09:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2037.codfw.wmnet with reason: host reimage
* 09:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:15 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:14 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2037.codfw.wmnet with OS trixie
* 09:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:12 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:59 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:56 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2037.codfw.wmnet with OS trixie
* 08:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2037.codfw.wmnet with OS trixie
* 08:53 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2037: Upgrading es2037.codfw.wmnet
* 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2037: Upgrading es2037.codfw.wmnet
* 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:45 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:45 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:44 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:43 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:41 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:40 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:35 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P94117 and previous config saved to /var/cache/conftool/dbconfig/20260615-081440-fceratto.json
* 08:10 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301373{{!}}translate: production opensearch on k8s endpoints (T425377)]] (duration: 20m 54s)
* 08:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2047: Migration of es2047.codfw.wmnet completed
* 08:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P94115 and previous config saved to /var/cache/conftool/dbconfig/20260615-080432-fceratto.json
* 08:03 atsuko@deploy1003: atsuko: Continuing with deployment
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P94114 and previous config saved to /var/cache/conftool/dbconfig/20260615-075425-fceratto.json
* 07:53 atsuko@deploy1003: atsuko: Backport for [[gerrit:1301373{{!}}translate: production opensearch on k8s endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:49 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1301373{{!}}translate: production opensearch on k8s endpoints (T425377)]]
* 07:47 dcausse@deploy1003: mwscript-k8s job started: namespaceDupes cswiki --fix # [[phab:T428619|T428619]]
* 07:46 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301675{{!}}Switch wmgUseCalendar to false for dewikivoyage (T429095)]], [[gerrit:1300301{{!}}Add alias namespace for cswiki (T428619)]] (duration: 34m 37s)
* 07:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P94112 and previous config saved to /var/cache/conftool/dbconfig/20260615-074417-fceratto.json
* 07:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:33 dcausse@deploy1003: vadymts1, dcausse: Continuing with deployment
* 07:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:31 cwilliams@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on db-test2001.codfw.wmnet with reason: Testing
* 07:28 dcausse@deploy1003: vadymts1, dcausse: Backport for [[gerrit:1301675{{!}}Switch wmgUseCalendar to false for dewikivoyage (T429095)]], [[gerrit:1300301{{!}}Add alias namespace for cswiki (T428619)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:26 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:26 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:25 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:24 arnaudb@dns1005: END - running authdns-update
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1163 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P94110 and previous config saved to /var/cache/conftool/dbconfig/20260615-072446-fceratto.json
* 07:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 07:24 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:23 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:23 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2047: Migration of es2047.codfw.wmnet completed
* 07:23 arnaudb@dns1005: START - running authdns-update
* 07:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:21 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:11 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2047.codfw.wmnet with OS trixie
* 07:11 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1301675{{!}}Switch wmgUseCalendar to false for dewikivoyage (T429095)]], [[gerrit:1300301{{!}}Add alias namespace for cswiki (T428619)]]
* 07:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 06:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2047.codfw.wmnet with reason: host reimage
* 06:53 moritzm: imported zookeeper 3.4.13-6+wmf12u1 to component/zookeeper34 for bookworm-wikimedia [[phab:T428495|T428495]]
* 06:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2047.codfw.wmnet with reason: host reimage
* 06:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2047.codfw.wmnet with OS trixie
* 06:28 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2047: Upgrading es2047.codfw.wmnet
* 06:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2047: Upgrading es2047.codfw.wmnet
* 06:27 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 06:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:59 marostegui: install mariadb 10.11.18 on pc1 [[phab:T428861|T428861]]
* 05:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2021.codfw.wmnet,pc1021.eqiad.wmnet with reason: upgrading
* 05:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:56 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:56 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:49 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:49 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2046', diff saved to https://phabricator.wikimedia.org/P94105 and previous config saved to /var/cache/conftool/dbconfig/20260615-053403-marostegui.json
* 05:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on es2046.codfw.wmnet with reason: cloning
* 05:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on es2045.codfw.wmnet with reason: crash
* 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2046', diff saved to https://phabricator.wikimedia.org/P94104 and previous config saved to /var/cache/conftool/dbconfig/20260615-053041-marostegui.json
* 02:18 Amir1: making Dexbot a bot in cywiki ([[phab:T428927|T428927]])
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 58s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-14 ==
* 11:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 11:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 11:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 11:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 34s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-13 ==
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-12 ==
* 19:54 dwisehaupt@dns1004: END - running authdns-update
* 19:52 dwisehaupt@dns1004: START - running authdns-update
* 18:33 dwisehaupt@dns1006: END - running authdns-update
* 18:32 dwisehaupt@dns1006: START - running authdns-update
* 16:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 15:59 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 15:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 15:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:43 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301371{{!}}Hotfix for T428620 (T428620)]] (duration: 11m 17s)
* 14:36 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 14:35 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1301371{{!}}Hotfix for T428620 (T428620)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:31 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1301371{{!}}Hotfix for T428620 (T428620)]]
* 14:29 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:28 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:22 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:22 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:22 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:22 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 12:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 12:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 12:04 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:04 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:04 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 12:02 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 12:01 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 11:40 moritzm: installing Linux 5.10.257 on Bullseye hosts
* 11:36 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:35 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:35 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:24 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 11:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:56 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:56 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:49 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:49 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:40 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:37 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:36 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:35 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:12 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:12 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:08 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 09:59 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 09:58 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 09:57 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.disable-merges (exit_code=0)
* 06:11 jmm@cumin2002: START - Cookbook sre.puppet.disable-merges
* 03:07 ryankemper: [[phab:T427951|T427951]] sorry, `[eqiad,codfw].mediawiki.page_html_content_change.rc0` (accidentally a word)
* 03:06 ryankemper: [[phab:T427951|T427951]] Deleted all 20 unused dev/test topics on kafka-jumbo (verified empty first); 2 (`[eqiad,codfw]page_html_content_change.rc0`) were immediately auto-recreated empty by a still-running `dse-k8s` enrichment consumer; awaiting owner confirmation before final re-delete
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 01m 13s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:00 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
== 2026-06-11 ==
* 22:27 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 22:26 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 22:14 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 22:13 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 22:05 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300906{{!}}Restore MediaViewer toggle in Special:Preferences (T428742)]] (duration: 30m 51s)
* 21:58 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host releases2003.codfw.wmnet with OS trixie
* 21:52 egardner@deploy1003: egardner: Continuing with deployment
* 21:51 egardner@deploy1003: egardner: Backport for [[gerrit:1300906{{!}}Restore MediaViewer toggle in Special:Preferences (T428742)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:34 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1300906{{!}}Restore MediaViewer toggle in Special:Preferences (T428742)]]
* 21:34 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: host reimage
* 21:29 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300913{{!}}Avoid the escaping from nowiki processing (T398967)]] (duration: 09m 09s)
* 21:28 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on releases2003.codfw.wmnet with reason: host reimage
* 21:25 arlolra@deploy1003: arlolra: Continuing with deployment
* 21:22 arlolra@deploy1003: arlolra: Backport for [[gerrit:1300913{{!}}Avoid the escaping from nowiki processing (T398967)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:20 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1300913{{!}}Avoid the escaping from nowiki processing (T398967)]]
* 21:07 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300911{{!}}hCaptcha: Enable for badlogin for all small wikis (T426875)]], [[gerrit:1300905{{!}}RadioRangeBallot: Fix strict mode issue (T428947)]] (duration: 10m 43s)
* 21:06 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on A:cp-text and not P<nowiki>{</nowiki>cp7008*<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 21:01 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 21:00 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300911{{!}}hCaptcha: Enable for badlogin for all small wikis (T426875)]], [[gerrit:1300905{{!}}RadioRangeBallot: Fix strict mode issue (T428947)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:56 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300911{{!}}hCaptcha: Enable for badlogin for all small wikis (T426875)]], [[gerrit:1300905{{!}}RadioRangeBallot: Fix strict mode issue (T428947)]]
* 20:51 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300842{{!}}Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313)]], [[gerrit:1300843{{!}}[A11y] Donor Badge: Remove Badge button disappears too quickly (T428646)]], [[gerrit:1300896{{!}}Donor Delight Badge, styles: Amending to final design review feedback (T427313)]] (duration: 34m 10s)
* 20:39 jdrewniak@deploy1003: annet, jdrewniak: Continuing with deployment
* 20:35 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host releases2003.codfw.wmnet with OS trixie
* 20:34 jdrewniak@deploy1003: annet, jdrewniak: Backport for [[gerrit:1300842{{!}}Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313)]], [[gerrit:1300843{{!}}[A11y] Donor Badge: Remove Badge button disappears too quickly (T428646)]], [[gerrit:1300896{{!}}Donor Delight Badge, styles: Amending to final design review feedback (T427313)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1300842{{!}}Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313)]], [[gerrit:1300843{{!}}[A11y] Donor Badge: Remove Badge button disappears too quickly (T428646)]], [[gerrit:1300896{{!}}Donor Delight Badge, styles: Amending to final design review feedback (T427313)]]
* 19:12 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 18:12 ozge@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 18:12 ozge@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 17:52 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300865{{!}}UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146)]] (duration: 08m 15s)
* 17:48 reedy@deploy1003: reedy: Continuing with deployment
* 17:46 reedy@deploy1003: reedy: Backport for [[gerrit:1300865{{!}}UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:44 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1300865{{!}}UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146)]]
* 17:26 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:25 blake@deploy1003: Scap cancelled without rolling back.
* 17:25 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 17:24 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 17:24 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:24 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 17:24 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:23 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 17:23 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 17:23 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:23 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 17:23 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:23 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 17:20 blake@deploy1003: blake: apache config update ([[phab:T428772|T428772]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:20 blake@deploy1003: Started scap sync-world: apache config update ([[phab:T428772|T428772]])
* 17:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 17:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Migration of db2212.codfw.wmnet completed
* 17:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 17:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1235: Migration of db1235.eqiad.wmnet completed
* 17:08 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:45 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:43 dzahn@dns1005: END - running authdns-update
* 16:42 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:41 dzahn@dns1005: START - running authdns-update
* 16:41 mutante: releases.wikimedia.org - switching backend from codfw to eqiad - releases1003 is now the source of rsync for uploaded releases files (use releases.discovery.wmnet to not have to think about it) - [[phab:T418299|T418299]]
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts rdb2007.codfw.wmnet
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts rdb1011.eqiad.wmnet
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2009.codfw.wmnet
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:33 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Migration of db2212.codfw.wmnet completed
* 16:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1235: Migration of db1235.eqiad.wmnet completed
* 16:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2212.codfw.wmnet with OS trixie
* 16:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1235.eqiad.wmnet with OS trixie
* 16:13 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:07 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 16:05 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 16:05 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 16:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 16:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2212.codfw.wmnet with reason: host reimage
* 16:01 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
* 16:01 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:01 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
* 16:01 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 16:00 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
* 16:00 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 16:00 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
* 16:00 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2212.codfw.wmnet with reason: host reimage
* 15:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1235.eqiad.wmnet with reason: host reimage
* 15:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:58 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 15:57 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
* 15:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:57 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 15:57 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/wikifeeds: apply
* 15:56 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2009.codfw.wmnet
* 15:55 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:55 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb1011.eqiad.wmnet
* 15:55 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:55 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2007.codfw.wmnet
* 15:54 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 15:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1235.eqiad.wmnet with reason: host reimage
* 15:54 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 15:53 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 15:53 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 15:40 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 15:40 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2212.codfw.wmnet with OS trixie
* 15:39 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 15:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1235.eqiad.wmnet with OS trixie
* 15:36 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 15:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1235: Upgrading db1235.eqiad.wmnet
* 15:35 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 15:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1235: Upgrading db1235.eqiad.wmnet
* 15:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:32 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:31 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 15:30 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300822{{!}}T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530)]] (duration: 11m 29s)
* 15:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2212: Upgrading db2212.codfw.wmnet
* 15:26 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2212: Upgrading db2212.codfw.wmnet
* 15:26 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:26 cscott@deploy1003: cscott: Continuing with deployment
* 15:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1235: Upgrading db1235.eqiad.wmnet
* 15:25 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1235: Upgrading db1235.eqiad.wmnet
* 15:25 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:21 cscott@deploy1003: cscott: Backport for [[gerrit:1300822{{!}}T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:19 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1300822{{!}}T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530)]]
* 15:18 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 15:17 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 15:13 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 15:13 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 15:13 moritzm: installing libdbi-perl security updates
* 14:53 moritzm: installing Bind security updates (just client-side tools/libraries)
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry
* 14:48 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry
* 14:43 moritzm: installing Poppler security updates
* 14:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:33 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 14:32 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 14:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1234: Migration of db1234.eqiad.wmnet completed
* 14:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
* 14:24 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
* 14:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
* 14:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
* 14:00 Lucas_WMDE: UTC afternoon backport+config window done
* 13:58 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300733{{!}}stream: webrequest.page_view_stats.dev0 (T428725)]] (duration: 08m 12s)
* 13:57 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5024.*
* 13:55 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp5024.*
* 13:55 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5020.*
* 13:54 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 13:52 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1300733{{!}}stream: webrequest.page_view_stats.dev0 (T428725)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:51 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5004*<nowiki>}</nowiki> and A:liberica
* 13:50 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1300733{{!}}stream: webrequest.page_view_stats.dev0 (T428725)]]
* 13:50 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5004*<nowiki>}</nowiki> and A:liberica
* 13:50 slyngs: reloading liberica config on lvs5004
* 13:50 moritzm: installing openssl security updates
* 13:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5006.eqsin.wmnet with OS bookworm
* 13:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1234: Migration of db1234.eqiad.wmnet completed
* 13:46 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:45 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2202.codfw.wmnet with OS trixie
* 13:43 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298890{{!}}Add 2FA enforcement demotion config for phase 3 groups (T423120)]] (duration: 07m 19s)
* 13:39 alexsanford@deploy1003: alexsanford: Continuing with deployment
* 13:38 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1298890{{!}}Add 2FA enforcement demotion config for phase 3 groups (T423120)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:36 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1298890{{!}}Add 2FA enforcement demotion config for phase 3 groups (T423120)]]
* 13:36 slyngshede@dns1004: END - running authdns-update
* 13:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1234.eqiad.wmnet with OS trixie
* 13:34 moritzm: installing dovecot security updates
* 13:34 slyngshede@dns1004: START - running authdns-update
* 13:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:32 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300787{{!}}hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940)]] (duration: 06m 59s)
* 13:29 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 13:29 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 13:29 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:29 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:28 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:28 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:28 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 13:27 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300787{{!}}hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2202.codfw.wmnet with reason: host reimage
* 13:25 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300787{{!}}hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940)]]
* 13:25 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:24 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300736{{!}}fix: correct intake-url and payload type for NCS experiment events (T422295)]] (duration: 06m 51s)
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
* 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
* 13:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
* 13:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2202.codfw.wmnet with reason: host reimage
* 13:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1300736{{!}}fix: correct intake-url and payload type for NCS experiment events (T422295)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 13:17 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 13:16 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1300736{{!}}fix: correct intake-url and payload type for NCS experiment events (T422295)]]
* 13:15 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
* 13:14 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:13 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300731{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] (duration: 08m 47s)
* 13:13 andrewbogott: sudo -i reprepro --noskipold --component thirdparty/openstack-trixie-flamingo-backports update trixie-wikimedia
* 13:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
* 13:12 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 13:12 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/iOS_FAQ 'Wikimedia Apps/FAQ/iOS' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:12 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 13:12 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 13:11 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 13:11 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:11 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:11 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 13:11 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 13:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 13:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 13:09 gkyziridis@deploy1003: gkyziridis: Continuing with deployment
* 13:06 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1300731{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:06 claime: echo 'https://api.wikimedia.org/service/lw/specs/openapi.yaml' {{!}} mwscript-k8s --attach -- purgeList.php
* 13:04 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1300731{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]]
* 13:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2202.codfw.wmnet with OS trixie
* 13:00 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:57 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1234.eqiad.wmnet with OS trixie
* 12:55 moritzm: installing Exim security updates on Bullseye
* 12:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ganeti5006
* 12:47 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti5006
* 12:46 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti5006
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti5006.eqsin.wmnet 9.0.132.10.in-addr.arpa 9.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 12:46 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti5006.eqsin.wmnet 9.0.132.10.in-addr.arpa 9.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5006 - jmm@cumin2002"
* 12:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5006 - jmm@cumin2002"
* 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1234: Upgrading db1234.eqiad.wmnet
* 12:44 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1234: Upgrading db1234.eqiad.wmnet
* 12:44 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2188: Migration of db2188.codfw.wmnet completed
* 12:29 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "UX improvements - oblivian@cumin1003"
* 12:29 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: UX improvements - oblivian@cumin1003
* 12:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1232: Migration of db1232.eqiad.wmnet completed
* 12:28 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: UX improvements - oblivian@cumin1003
* 12:28 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "UX improvements - oblivian@cumin1003"
* 12:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:26 jmm@cumin2002: START - Cookbook sre.hosts.move-vlan for host ganeti5006
* 12:26 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5006.eqsin.wmnet with OS bookworm
* 12:21 moritzm: remove ganeti5006 from eqsin cluster for reimage [[phab:T428229|T428229]]
* 12:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 12:10 moritzm: installing openjdk-21 security updates on Bookworm
* 12:03 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300764{{!}}Remove GrowthExperiments extension from closed wikis (T428884)]] (duration: 06m 53s)
* 11:59 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 11:58 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1300764{{!}}Remove GrowthExperiments extension from closed wikis (T428884)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:56 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1300764{{!}}Remove GrowthExperiments extension from closed wikis (T428884)]]
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb1012.eqiad.wmnet
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2010.codfw.wmnet
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 11:46 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:46 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2008.codfw.wmnet
* 11:46 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2188: Migration of db2188.codfw.wmnet completed
* 11:44 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 11:43 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:43 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1232: Migration of db1232.eqiad.wmnet completed
* 11:38 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:37 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 11:37 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 11:36 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 11:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2188.codfw.wmnet with OS trixie
* 11:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb1012.eqiad.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2008.codfw.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2010.codfw.wmnet
* 11:33 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 11:32 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 11:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1232.eqiad.wmnet with OS trixie
* 11:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2002.codfw.wmnet
* 11:25 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300749{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300751{{!}}hCaptcha: Enable for DiscussionTools on all wikis (T426039)]] (duration: 08m 38s)
* 11:21 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 11:19 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300749{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300751{{!}}hCaptcha: Enable for DiscussionTools on all wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2188.codfw.wmnet with reason: host reimage
* 11:17 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300749{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300751{{!}}hCaptcha: Enable for DiscussionTools on all wikis (T426039)]]
* 11:15 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2188.codfw.wmnet with reason: host reimage
* 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1232.eqiad.wmnet with reason: host reimage
* 11:13 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2002.codfw.wmnet
* 11:13 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 11:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 11:11 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 11:09 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2001.codfw.wmnet
* 11:09 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1232.eqiad.wmnet with reason: host reimage
* 11:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:04 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2001.codfw.wmnet
* 11:04 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
* 11:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1262.eqiad.wmnet with reason: crash
* 11:00 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 11:00 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
* 10:59 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:59 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 10:58 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2188.codfw.wmnet with OS trixie
* 10:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2188: Upgrading db2188.codfw.wmnet
* 10:52 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2188: Upgrading db2188.codfw.wmnet
* 10:52 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1232.eqiad.wmnet with OS trixie
* 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1232: Upgrading db1232.eqiad.wmnet
* 10:48 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1232: Upgrading db1232.eqiad.wmnet
* 10:48 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:33 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:32 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:31 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300734{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300727{{!}}hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039)]] (duration: 11m 01s)
* 10:26 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 10:23 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:23 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:22 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300734{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300727{{!}}hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:20 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300734{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300727{{!}}hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039)]]
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:10 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:10 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 10:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2045.codfw.wmnet with OS trixie
* 10:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2046', diff saved to https://phabricator.wikimedia.org/P94069 and previous config saved to /var/cache/conftool/dbconfig/20260611-100221-marostegui.json
* 10:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2046', diff saved to https://phabricator.wikimedia.org/P94068 and previous config saved to /var/cache/conftool/dbconfig/20260611-100145-marostegui.json
* 10:01 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:59 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300580{{!}}ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976)]] (duration: 15m 41s)
* 09:54 jiji@deploy1003: jiji: Continuing with deployment
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2045.codfw.wmnet with reason: host reimage
* 09:45 jiji@deploy1003: jiji: Backport for [[gerrit:1300580{{!}}ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:43 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1300580{{!}}ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976)]]
* 09:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2045.codfw.wmnet with reason: host reimage
* 09:37 elukey: uploaded spicerack_12.8.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 09:26 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
* 09:26 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es2045.codfw.wmnet with OS bookworm
* 09:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2176: Migration of db2176.codfw.wmnet completed
* 09:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1219: Migration of db1219.eqiad.wmnet completed
* 09:11 claime: cumin -x 'A:swift-fe' "disable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
* 08:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS bookworm
* 08:39 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2176: Migration of db2176.codfw.wmnet completed
* 08:34 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94055)
* 08:34 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1219: Migration of db1219.eqiad.wmnet completed
* 08:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94053)
* 08:30 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T428823|T428823]] (duration: 01m 18s)
* 08:29 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T428823|T428823]]
* 08:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2176.codfw.wmnet with OS trixie
* 08:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1021: Migration to 10.11.17
* 08:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 08:25 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 08:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc1021: Migration to 10.11.17
* 08:25 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94052)
* 08:24 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): Testing upgrade for [[phab:T428823|T428823]] (duration: 01m 17s)
* 08:23 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): Testing upgrade for [[phab:T428823|T428823]]
* 08:22 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94051)
* 08:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1219.eqiad.wmnet with OS trixie
* 08:17 moritzm: installing PHP 8.2 security updates
* 08:15 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 08:14 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 08:11 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 08:11 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 08:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
* 08:08 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1013.eqiad.wmnet with OS trixie
* 08:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:06 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 08:06 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 08:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2021.codfw.wmnet,pc1021.eqiad.wmnet with reason: upgrade
* 08:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1219.eqiad.wmnet with reason: host reimage
* 08:05 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 08:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 08:04 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
* 08:04 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 08:03 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 08:03 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 07:58 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1219.eqiad.wmnet with reason: host reimage
* 07:56 marostegui: install mariadb 10.11.17 on pc1 [[phab:T427345|T427345]]
* 07:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1013.eqiad.wmnet with reason: host reimage
* 07:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1013.eqiad.wmnet with reason: host reimage
* 07:49 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 07:49 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 07:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 07:47 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 07:47 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 07:46 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2176.codfw.wmnet with OS trixie
* 07:43 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1219.eqiad.wmnet with OS trixie
* 07:43 moritzm: imported Jenkins 2.541.3 for thirdparty/ci (Bullseye) and thirdparty/jenkins (Bookworm, Trixie)
* 07:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 07:35 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1013.eqiad.wmnet with OS trixie
* 07:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2176: Upgrading db2176.codfw.wmnet
* 07:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1219: Upgrading db1219.eqiad.wmnet
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2176: Upgrading db2176.codfw.wmnet
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1219: Upgrading db1219.eqiad.wmnet
* 07:31 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:30 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 07:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1163: Repooling
* 07:19 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 06:51 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
* 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2042', diff saved to https://phabricator.wikimedia.org/P94044 and previous config saved to /var/cache/conftool/dbconfig/20260611-065049-marostegui.json
* 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2042', diff saved to https://phabricator.wikimedia.org/P94043 and previous config saved to /var/cache/conftool/dbconfig/20260611-065027-marostegui.json
* 06:44 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1163: Repooling
* 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1163 [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94041 and previous config saved to /var/cache/conftool/dbconfig/20260611-064319-fceratto.json
* 06:42 fceratto@dns1005: END - running authdns-update
* 06:40 fceratto@dns1005: START - running authdns-update
* 06:33 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:33 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-write for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:33 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1184 to s1 primary and set section read-write [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94040 and previous config saved to /var/cache/conftool/dbconfig/20260611-063323-fceratto.json
* 06:32 fceratto@cumin1003: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94039 and previous config saved to /var/cache/conftool/dbconfig/20260611-063251-fceratto.json
* 06:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:32 fceratto@cumin1003: Dbctl change: Setting sections s1 as read-write for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:32 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-write for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:31 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94037 and previous config saved to /var/cache/conftool/dbconfig/20260611-063100-fceratto.json
* 06:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:30 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-only for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:30 fceratto@cumin1003: Dbctl change: Setting sections s1 as read-only for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:29 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:29 federico3: Starting s1 eqiad failover from db1163 to db1184 - [[phab:T426083|T426083]]
* 06:22 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1184 with weight 0 [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94035 and previous config saved to /var/cache/conftool/dbconfig/20260611-062224-fceratto.json
* 06:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T426083|T426083]]
* 05:37 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 05:28 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 05:27 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 05:18 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 05:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2045: Upgrading es2045.codfw.wmnet
* 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2045: Upgrading es2045.codfw.wmnet
* 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 44s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:23 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2046.*
* 01:19 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 01:18 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 01:18 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1009.eqiad.wmnet with OS trixie
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:11 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:11 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:11 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:10 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:10 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:09 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 01:09 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 01:08 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 01:08 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 01:08 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 01:07 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:07 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 01:06 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 01:06 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 01:06 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 01:05 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 01:05 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 01:05 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 01:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
* 00:58 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
* 00:54 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main1009
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main1009
* 00:41 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main1009
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main1009.eqiad.wmnet 37.48.64.10.in-addr.arpa 7.3.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:41 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main1009.eqiad.wmnet 37.48.64.10.in-addr.arpa 7.3.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1009 - jasmine@cumin2002"
* 00:40 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1009 - jasmine@cumin2002"
* 00:39 cdanis@cumin1003: dbctl commit (dc=all): 'depool db1262', diff saved to https://phabricator.wikimedia.org/P94032 and previous config saved to /var/cache/conftool/dbconfig/20260611-003950-cdanis.json
* 00:36 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 00:34 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5020.*
* 00:30 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main1009
* 00:30 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1009.eqiad.wmnet with OS trixie
* 00:03 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5024.*
== 2026-06-10 ==
* 23:53 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5024.*
* 23:15 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300154{{!}}Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188)]] (duration: 11m 37s)
* 23:11 krinkle@deploy1003: krinkle: Continuing with deployment
* 23:06 krinkle@deploy1003: krinkle: Backport for [[gerrit:1300154{{!}}Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:04 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1300154{{!}}Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188)]]
* 22:57 ladsgroup@dns1004: END - running authdns-update
* 22:55 ladsgroup@dns1004: START - running authdns-update
* 22:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5024.eqsin.wmnet with OS trixie
* 22:13 mutante: gerrit - restarting service for logging change
* 22:11 dzahn@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:10:00 on gerrit.wikimedia.org with reason: service restart
* 22:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on gerrit2003.wikimedia.org with reason: service restart
* 22:06 mutante: gerrit-spare: restarting gerrit
* 22:06 mutante: gerrit-replica: restarting gerrit
* 21:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 21:37 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 21:22 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300250{{!}}ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801)]], [[gerrit:1300248{{!}}tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade]] (duration: 08m 23s)
* 21:17 jforrester@deploy1003: jforrester: Continuing with deployment
* 21:15 jforrester@deploy1003: jforrester: Backport for [[gerrit:1300250{{!}}ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801)]], [[gerrit:1300248{{!}}tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:13 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1300250{{!}}ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801)]], [[gerrit:1300248{{!}}tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade]]
* 21:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5024
* 21:02 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5024
* 21:02 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300247{{!}}Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902)]] (duration: 06m 51s)
* 21:00 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5024
* 21:00 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5024.eqsin.wmnet 35.0.132.10.in-addr.arpa 5.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 21:00 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5024.eqsin.wmnet 35.0.132.10.in-addr.arpa 5.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 21:00 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:00 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5024 - brett@cumin2002"
* 20:59 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5024 - brett@cumin2002"
* 20:57 catrope@deploy1003: catrope: Continuing with deployment
* 20:57 catrope@deploy1003: catrope: Backport for [[gerrit:1300247{{!}}Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:55 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1300247{{!}}Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902)]]
* 20:54 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:50 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5024
* 20:49 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
* 20:48 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5020.*
* 20:44 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300073{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] (duration: 11m 55s)
* 20:40 catrope@deploy1003: catrope, gkyziridis: Continuing with deployment
* 20:34 catrope@deploy1003: catrope, gkyziridis: Backport for [[gerrit:1300073{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:32 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1300073{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]]
* 20:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5020.eqsin.wmnet with OS trixie
* 20:30 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300226{{!}}[arzwiki] Change the wordmark (T427720)]] (duration: 09m 49s)
* 20:25 catrope@deploy1003: gergesshamon, catrope: Continuing with deployment
* 20:22 catrope@deploy1003: gergesshamon, catrope: Backport for [[gerrit:1300226{{!}}[arzwiki] Change the wordmark (T427720)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:20 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1300226{{!}}[arzwiki] Change the wordmark (T427720)]]
* 19:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 19:53 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 19:30 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 19:27 bblack@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=1) rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 19:23 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2046.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:19 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2046.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:19 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5020
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5020
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2044.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:18 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5020
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5020.eqsin.wmnet 24.0.132.10.in-addr.arpa 4.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:18 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5020.eqsin.wmnet 24.0.132.10.in-addr.arpa 4.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:17 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5020 - brett@cumin2002"
* 19:17 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5020 - brett@cumin2002"
* 19:14 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2044.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:11 brett@cumin2002: START - Cookbook sre.dns.netbox
* 19:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 19:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2174: Migration of db2174.codfw.wmnet completed
* 19:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 19:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1218: Migration of db1218.eqiad.wmnet completed
* 18:24 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5020
* 18:23 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS trixie
* 18:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2174: Migration of db2174.codfw.wmnet completed
* 18:20 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 18:17 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1218: Migration of db1218.eqiad.wmnet completed
* 18:16 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5018.*
* 18:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2174.codfw.wmnet with OS trixie
* 18:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1218.eqiad.wmnet with OS trixie
* 17:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
* 17:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1218.eqiad.wmnet with reason: host reimage
* 17:46 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2010.codfw.wmnet with OS trixie
* 17:45 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 17:44 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 17:44 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
* 17:42 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1218.eqiad.wmnet with reason: host reimage
* 17:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94021)
* 17:29 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
* 17:26 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1218.eqiad.wmnet with OS trixie
* 17:26 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2174.codfw.wmnet with OS trixie
* 17:25 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:24 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:24 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1218: Upgrading db1218.eqiad.wmnet
* 17:24 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:24 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1218: Upgrading db1218.eqiad.wmnet
* 17:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 17:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2174: Upgrading db2174.codfw.wmnet
* 17:23 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:23 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
* 17:23 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:22 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2174: Upgrading db2174.codfw.wmnet
* 17:22 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:22 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 17:22 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:22 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 17:22 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-text and not P<nowiki>{</nowiki>cp7008*<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 17:21 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 17:21 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:19 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 17:19 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:18 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:18 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:17 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:17 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:17 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:13 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:12 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-ntp (exit_code=0) rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 17:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 17:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1206: Migration of db1206.eqiad.wmnet completed
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2010
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2010
* 17:02 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2010
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2010.codfw.wmnet 35.48.192.10.in-addr.arpa 5.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:02 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2010.codfw.wmnet 35.48.192.10.in-addr.arpa 5.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2010 - jasmine@cumin2002"
* 17:01 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2010 - jasmine@cumin2002"
* 16:57 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 16:50 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2010
* 16:50 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS trixie
* 16:41 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 16:39 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 16:39 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 16:34 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 16:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5018.eqsin.wmnet with OS trixie
* 16:22 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 16:20 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 16:17 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1206: Migration of db1206.eqiad.wmnet completed
* 16:15 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 16:15 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 16:14 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 16:12 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 16:12 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:11 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 16:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1206.eqiad.wmnet with OS trixie
* 16:01 bblack: apt: uploaded libvmod-wmfuniq 0.3.0 for trixie
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 15:53 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:52 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:51 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 15:50 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1206.eqiad.wmnet with reason: host reimage
* 15:45 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1206.eqiad.wmnet with reason: host reimage
* 15:43 sukhe@cumin1003: END (FAIL) - Cookbook sre.dns.admin (exit_code=99) DNS admin: depool drmrs [reason: no reason specified, no task ID specified]
* 15:42 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool drmrs [reason: no reason specified, no task ID specified]
* 15:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2173: Migration of db2173.codfw.wmnet completed
* 15:34 topranks: drain traffic through cr2-drmrs to reset pic0
* 15:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94013)
* 15:30 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1206.eqiad.wmnet with OS trixie
* 15:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1206: Upgrading db1206.eqiad.wmnet
* 15:28 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1206: Upgrading db1206.eqiad.wmnet
* 15:27 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:25 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:24 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1009
* 15:24 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Harroyo-wmf out of all services on: 2436 hosts
* 15:23 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1009
* 15:21 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:20 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release
* 15:19 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5018
* 15:19 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5018
* 15:18 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:18 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5018
* 15:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5018.eqsin.wmnet 18.0.132.10.in-addr.arpa 8.1.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 15:18 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5018.eqsin.wmnet 18.0.132.10.in-addr.arpa 8.1.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 15:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:15 brett@cumin2002: START - Cookbook sre.dns.netbox
* 15:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1195: Migration of db1195.eqiad.wmnet completed
* 15:12 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:11 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:11 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin1003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:11 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin1003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:08 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300169{{!}}Fix snak value display for rtl languages (T360854)]], [[gerrit:1300168{{!}}Fix snak value display for rtl languages (T360854)]] (duration: 08m 39s)
* 15:03 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 15:01 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1300169{{!}}Fix snak value display for rtl languages (T360854)]], [[gerrit:1300168{{!}}Fix snak value display for rtl languages (T360854)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:59 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1300169{{!}}Fix snak value display for rtl languages (T360854)]], [[gerrit:1300168{{!}}Fix snak value display for rtl languages (T360854)]]
* 14:58 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:55 Lucas_WMDE: lucaswerkmeister-wmde@deploy1003 $ printf 'https://www.mediawiki.org/keys/%s\n' '' 'keys.txt' 'keys.html' {{!}} mwscript-k8s --attach --comment=[[phab:T423267|T423267]] purgeList mediawikiwiki
* 14:54 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release, now with correct schema
* 14:53 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2173: Migration of db2173.codfw.wmnet completed
* 14:50 ayounsi@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin2003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:50 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:48 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:47 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299614{{!}}Add my public key to mediawiki.org/keys (T423267)]] (duration: 08m 33s)
* 14:46 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:42 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, matmarex: Continuing with deployment
* 14:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2173.codfw.wmnet with OS trixie
* 14:40 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, matmarex: Backport for [[gerrit:1299614{{!}}Add my public key to mediawiki.org/keys (T423267)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:40 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:40 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:38 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1299614{{!}}Add my public key to mediawiki.org/keys (T423267)]]
* 14:38 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 14:34 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:34 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:33 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 14:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1195: Migration of db1195.eqiad.wmnet completed
* 14:28 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 14:26 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 14:26 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 14:24 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release, now with dblist translate
* 14:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2173.codfw.wmnet with reason: host reimage
* 14:23 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 14:22 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 14:22 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 14:21 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 14:20 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart (exit_code=0) rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 14:20 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2173.codfw.wmnet with reason: host reimage
* 14:20 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1195.eqiad.wmnet with OS trixie
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 14:16 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 14:15 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 14:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 14:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 14:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 14:13 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:13 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 14:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 14:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 14:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 14:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 14:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 14:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 14:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2173.codfw.wmnet with OS trixie
* 14:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 14:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1195.eqiad.wmnet with reason: host reimage
* 14:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 13:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2173: Upgrading db2173.codfw.wmnet
* 13:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2173: Upgrading db2173.codfw.wmnet
* 13:58 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:58 atsuko@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/ttmserver-export.php --wiki=default --ttmserver eqiad-test # [[phab:T425377|T425377]] populating production index on test cluster to estimate time required for the release
* 13:56 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1195.eqiad.wmnet with reason: host reimage
* 13:54 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Atieno out of all services on: 2436 hosts
* 13:42 Lucas_WMDE: UTC afternoon backport+config window done
* 13:42 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1195.eqiad.wmnet with OS trixie
* 13:36 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297237{{!}}wmf-config: Update private subnets to include additions (T427393)]] (duration: 07m 20s)
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1195: Upgrading db1195.eqiad.wmnet
* 13:33 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling restart_daemons on A:hcaptcha-proxy and A:hcaptcha-proxy
* 13:33 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling restart_daemons on A:durum and A:durum
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2170: Migration of db2170.codfw.wmnet completed
* 13:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1195: Upgrading db1195.eqiad.wmnet
* 13:32 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:32 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, brett: Continuing with deployment
* 13:32 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough
* 13:31 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 13:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, brett: Backport for [[gerrit:1297237{{!}}wmf-config: Update private subnets to include additions (T427393)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:31 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 13:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1297237{{!}}wmf-config: Update private subnets to include additions (T427393)]]
* 13:28 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5018.eqsin.wmnet with reason: host down
* 13:28 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling restart_daemons on A:tcpproxy and A:tcpproxy
* 13:25 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5018.eqsin.wmnet,service=(cdn{{!}}ats-be)
* 13:22 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 13:20 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling restart_daemons on A:durum and A:durum
* 13:20 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling restart_daemons on A:hcaptcha-proxy and A:hcaptcha-proxy
* 13:19 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299676{{!}}Enable ULS v2 on group0 wikis]] (duration: 17m 00s)
* 13:19 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough
* 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1186: Migration of db1186.eqiad.wmnet completed
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:15 sbisson@deploy1003: sbisson, abi: Continuing with deployment
* 13:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy rolling restart_daemons on A:tcpproxy and A:tcpproxy
* 13:05 sbisson@deploy1003: sbisson, abi: Backport for [[gerrit:1299676{{!}}Enable ULS v2 on group0 wikis]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1014.eqiad.wmnet with OS trixie
* 13:02 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1299676{{!}}Enable ULS v2 on group0 wikis]]
* 12:47 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2170: Migration of db2170.codfw.wmnet completed
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5004.eqsin.wmnet with OS bookworm
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1014.eqiad.wmnet with reason: host reimage
* 12:42 topranks: re-map DSCP AF41 from 'low' to 'normal' priority QoS class on network [[phab:T424640|T424640]]
* 12:41 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1014.eqiad.wmnet with reason: host reimage
* 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2170.codfw.wmnet with OS trixie
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1186: Migration of db1186.eqiad.wmnet completed
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1014
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host rdb1014
* 12:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1186.eqiad.wmnet with OS trixie
* 12:21 jiji@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host rdb1014
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) rdb1014.eqiad.wmnet 42.48.64.10.in-addr.arpa 2.4.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:21 jiji@cumin1003: START - Cookbook sre.dns.wipe-cache rdb1014.eqiad.wmnet 42.48.64.10.in-addr.arpa 2.4.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host rdb1014 - jiji@cumin1003"
* 12:21 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host rdb1014 - jiji@cumin1003"
* 12:20 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2170.codfw.wmnet with reason: host reimage
* 12:16 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 12:13 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1014
* 12:12 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1014.eqiad.wmnet with OS trixie
* 12:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2170.codfw.wmnet with reason: host reimage
* 12:08 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300104{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1300102{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1299643{{!}}wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792)]] (duration: 11m 06s)
* 12:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1186.eqiad.wmnet with reason: host reimage
* 12:03 reedy@deploy1003: reedy: Continuing with deployment
* 12:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1186.eqiad.wmnet with reason: host reimage
* 11:59 reedy@deploy1003: reedy: Backport for [[gerrit:1300104{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1300102{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1299643{{!}}wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:57 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1300104{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1300102{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1299643{{!}}wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792)]]
* 11:53 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2170.codfw.wmnet with OS trixie
* 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ganeti5004
* 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti5004
* 11:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2170: Upgrading db2170.codfw.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2170: Upgrading db2170.codfw.wmnet
* 11:49 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti5004
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti5004.eqsin.wmnet 40.0.132.10.in-addr.arpa 0.4.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 11:49 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti5004.eqsin.wmnet 40.0.132.10.in-addr.arpa 0.4.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5004 - jmm@cumin2002"
* 11:49 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5004 - jmm@cumin2002"
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:48 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1186.eqiad.wmnet with OS trixie
* 11:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1186: Upgrading db1186.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1186: Upgrading db1186.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:38 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:35 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 11:34 jmm@cumin2002: START - Cookbook sre.hosts.move-vlan for host ganeti5004
* 11:34 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 11:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5004.eqsin.wmnet with OS bookworm
* 11:34 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 11:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1151: Security updates
* 11:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:33 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:33 root@cumin1003: START - Cookbook sre.mysql.pool pool db1151: Security updates
* 11:31 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:30 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:30 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:30 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:27 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:27 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:23 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:23 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:23 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 11:23 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 11:16 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 11:15 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 11:15 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:15 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:09 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1151: Security updates
* 11:09 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:09 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:09 root@cumin1003: START - Cookbook sre.mysql.depool depool db1151: Security updates
* 11:08 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300092{{!}}ProductionServices: re-add poolcounter2006 (T426736)]] (duration: 06m 55s)
* 11:04 blake@deploy1003: blake: Continuing with deployment
* 11:04 blake@deploy1003: blake: Backport for [[gerrit:1300092{{!}}ProductionServices: re-add poolcounter2006 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:03 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:01 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300092{{!}}ProductionServices: re-add poolcounter2006 (T426736)]]
* 10:59 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2006.codfw.wmnet
* 10:57 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:57 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:56 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:56 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:56 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:56 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2006.codfw.wmnet
* 10:56 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300087{{!}}ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736)]] (duration: 06m 42s)
* 10:51 blake@deploy1003: blake: Continuing with deployment
* 10:51 moritzm: remove ganeti5004 from eqsin cluster for reimage [[phab:T428229|T428229]]
* 10:51 blake@deploy1003: blake: Backport for [[gerrit:1300087{{!}}ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:49 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300087{{!}}ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736)]]
* 10:47 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2005.codfw.wmnet
* 10:47 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:46 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:46 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:45 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:43 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2005.codfw.wmnet
* 10:43 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300082{{!}}ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736)]] (duration: 07m 38s)
* 10:41 moritzm: installing nginx security updates
* 10:38 blake@deploy1003: blake: Continuing with deployment
* 10:38 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1152: Security updates
* 10:38 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:38 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:38 root@cumin1003: START - Cookbook sre.mysql.pool pool db1152: Security updates
* 10:38 blake@deploy1003: blake: Backport for [[gerrit:1300082{{!}}ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:37 moritzm: failover Ganeti master in eqsin to ganeti5007 [[phab:T428229|T428229]]
* 10:35 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300082{{!}}ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736)]]
* 10:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1007.eqiad.wmnet
* 10:29 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1007.eqiad.wmnet
* 10:29 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300072{{!}}ProductionServices: reboot poolcounter1007 (T426736)]] (duration: 07m 45s)
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 10:27 jmm@cumin2002: DONE (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for sretest2009.codfw.wmnet: Renew puppet certificate - jmm@cumin2002
* 10:24 blake@deploy1003: blake: Continuing with deployment
* 10:23 blake@deploy1003: blake: Backport for [[gerrit:1300072{{!}}ProductionServices: reboot poolcounter1007 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:21 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:21 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300072{{!}}ProductionServices: reboot poolcounter1007 (T426736)]]
* 10:21 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:21 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:21 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:20 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1006.eqiad.wmnet
* 10:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Security updates
* 10:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:14 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:14 root@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Security updates
* 10:13 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1006.eqiad.wmnet
* 10:12 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300064{{!}}ProductionServices: reboot poolcounter1006.eqiad (T426736)]] (duration: 07m 46s)
* 10:07 blake@deploy1003: blake: Continuing with deployment
* 10:06 blake@deploy1003: blake: Backport for [[gerrit:1300064{{!}}ProductionServices: reboot poolcounter1006.eqiad (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:04 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300064{{!}}ProductionServices: reboot poolcounter1006.eqiad (T426736)]]
* 09:57 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300058{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]], [[gerrit:1300059{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]] (duration: 09m 32s)
* 09:52 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1300058{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]], [[gerrit:1300059{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1300058{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]], [[gerrit:1300059{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]]
* 09:35 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 09:34 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 09:32 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 09:32 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 09:26 moritzm: upgrade routinator in eqiad to 0.15.2 [[phab:T428456|T428456]]
* 09:23 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:23 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 09:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus5003.eqsin.wmnet to plain
* 09:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to plain
* 09:15 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:29 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:29 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:01 fceratto@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host db1215.eqiad.wmnet with OS trixie
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 07:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
* 07:44 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1215.eqiad.wmnet with reason: host reimage
* 07:41 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 07:40 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
* 07:40 moritzm: installing openssl security updates
* 07:39 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1215.eqiad.wmnet with reason: host reimage
* 07:38 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 07:37 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 07:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:29 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299556{{!}}ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561{{!}}ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529{{!}}translate: adding separate read/write endpoints (T425377)]] (duration: 14m 03s)
* 07:25 atsuko@deploy1003: atsuko: Continuing with deployment
* 07:23 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1215.eqiad.wmnet with OS trixie
* 07:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1215.eqiad.wmnet with reason: Reimage
* 07:21 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:20 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:20 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:17 atsuko@deploy1003: atsuko: Backport for [[gerrit:1299556{{!}}ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561{{!}}ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529{{!}}translate: adding separate read/write endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:15 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1299556{{!}}ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561{{!}}ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529{{!}}translate: adding separate read/write endpoints (T425377)]]
* 07:14 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:12 atsukoito: backporting extensions/Translate to wmf/1.47.0-wmf.5 and applying the config
* 07:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 06:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 06:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 05:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 05:43 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 05:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 05:41 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 47s)
* 02:07 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1008.eqiad.wmnet with OS trixie
* 02:03 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 02:02 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:52 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:51 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:51 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:50 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:50 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:49 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1008.eqiad.wmnet with reason: host reimage
* 01:49 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:49 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:49 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:49 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:48 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 01:48 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 01:47 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 01:47 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 01:46 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 01:46 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 01:44 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 01:44 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 01:43 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 01:43 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1008.eqiad.wmnet with reason: host reimage
* 01:25 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main1008
* 01:24 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main1008
* 01:24 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main1008
* 01:24 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main1008.eqiad.wmnet 45.32.64.10.in-addr.arpa 5.4.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 01:23 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main1008.eqiad.wmnet 45.32.64.10.in-addr.arpa 5.4.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 01:23 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:23 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1008 - jasmine@cumin2002"
* 01:23 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1008 - jasmine@cumin2002"
* 01:19 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 01:12 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main1008
* 01:11 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1008.eqiad.wmnet with OS trixie
* 01:00 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2009.codfw.wmnet with OS trixie
* 00:54 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 00:53 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 00:43 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
* 00:40 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:38 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
* 00:38 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:38 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:37 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:37 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:36 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 00:36 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:34 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 00:34 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 00:33 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 00:33 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 00:32 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 00:32 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 00:32 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2009
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2009
* 00:15 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2009
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2009.codfw.wmnet 33.48.192.10.in-addr.arpa 3.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:15 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2009.codfw.wmnet 33.48.192.10.in-addr.arpa 3.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2009 - jasmine@cumin2002"
* 00:15 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2009 - jasmine@cumin2002"
* 00:10 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 00:03 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2009
* 00:03 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS trixie
== 2026-06-09 ==
* 22:50 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299640{{!}}HandleSectionLinks: add temporary fallback to identify html headings (T428677)]] (duration: 08m 59s)
* 22:45 cscott@deploy1003: cscott: Continuing with deployment
* 22:43 cscott@deploy1003: cscott: Backport for [[gerrit:1299640{{!}}HandleSectionLinks: add temporary fallback to identify html headings (T428677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:41 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1299640{{!}}HandleSectionLinks: add temporary fallback to identify html headings (T428677)]]
* 22:15 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299639{{!}}[Bug] Donor Badge: Remove client prefs for control group (T428501)]] (duration: 20m 57s)
* 22:11 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:07 mutante: gerrit - apache httpd log file location moved to /srv/gerrit/site_path/review_site/logs/ [[phab:T425667|T425667]]
* 22:06 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on gerrit2003.wikimedia.org with reason: debug
* 21:56 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1299639{{!}}[Bug] Donor Badge: Remove client prefs for control group (T428501)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1299639{{!}}[Bug] Donor Badge: Remove client prefs for control group (T428501)]]
* 21:52 ryankemper: [[phab:T428241|T428241]] removed retired wdqs2009 full-graph journal dump (446G x2, ~892G) from clouddumps100[1-2]:/srv/dumps/xmldatadumps/public/other/wdqs
* 21:49 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299602{{!}}Revert "Create VectorComponentPageToolbar component" (T428649)]] (duration: 08m 16s)
* 21:48 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
* 21:45 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 21:43 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1299602{{!}}Revert "Create VectorComponentPageToolbar component" (T428649)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:41 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1299602{{!}}Revert "Create VectorComponentPageToolbar component" (T428649)]]
* 21:34 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit1003.wikimedia.org with reason: debug
* 21:27 maryum: Deployed security fix for [[phab:T428324|T428324]]
* 21:24 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
* 21:15 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
* 21:06 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
* 20:50 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS trixie
* 20:50 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299588{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270)]], [[gerrit:1299589{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T428270)]] (duration: 11m 13s)
* 20:46 cscott@deploy1003: cscott: Continuing with deployment
* 20:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS trixie
* 20:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:42 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:41 cscott@deploy1003: cscott: Backport for [[gerrit:1299588{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270)]], [[gerrit:1299589{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T428270)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:39 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1299588{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270)]], [[gerrit:1299589{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T428270)]]
* 20:38 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:33 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:32 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299454{{!}}wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902)]] (duration: 22m 08s)
* 20:28 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:28 cscott@deploy1003: cscott, gkyziridis: Continuing with deployment
* 20:24 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2004
* 20:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2004
* 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2003
* 20:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2003
* 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2002
* 20:13 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2002
* 20:13 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2001
* 20:13 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2001
* 20:12 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:12 cscott@deploy1003: cscott, gkyziridis: Backport for [[gerrit:1299454{{!}}wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1299454{{!}}wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902)]]
* 20:09 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:04 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:59 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:53 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:48 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:47 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:47 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:28 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wdqs1015.eqiad.wmnet
* 19:28 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:28 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wdqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
* 19:27 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wdqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
* 19:20 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
* 19:15 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2008.codfw.wmnet with OS trixie
* 19:15 ryankemper@cumin2002: START - Cookbook sre.hosts.decommission for hosts wdqs1015.eqiad.wmnet
* 19:12 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 19:12 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 19:00 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:58 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 18:58 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
* 18:58 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 18:58 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:57 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:57 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 18:56 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 18:56 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:55 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:55 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 18:55 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 18:54 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 18:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:54 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 18:53 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 18:53 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 18:53 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2003 to codfw - jhancock@cumin2002"
* 18:52 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2003 to codfw - jhancock@cumin2002"
* 18:52 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 18:52 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 18:51 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
* 18:51 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 18:51 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 18:51 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 18:50 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 18:50 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 18:47 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:47 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:47 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:42 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:42 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:31 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 18:29 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS trixie
* 18:26 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2008.codfw.wmnet with OS trixie
* 17:48 mutante: https://releases.wikimedia.org {{!}} https://releases-jenkins.wikimedia.org - down for maintenance [[phab:T418299|T418299]]
* 17:48 cmooney@dns2005: END - running authdns-update
* 17:47 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: reimage
* 17:47 cmooney@dns2005: START - running authdns-update
* 17:46 sukhe: sudo cumin 'A:hcaptcha-proxy' 'run-puppet-agent': rolling out CR {{Gerrit|1299427}} [[phab:T428539|T428539]]
* 17:43 jayme: kafka-main2008 is down due to hardware failure [[phab:T428654|T428654]]
* 17:32 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1002.eqiad.wmnet with OS trixie
* 17:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
* 17:06 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
* 17:05 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2008
* 17:05 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2008
* 17:04 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:04 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2008
* 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2008.codfw.wmnet 4.32.192.10.in-addr.arpa 4.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:04 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:04 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2008.codfw.wmnet 4.32.192.10.in-addr.arpa 4.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2008 - jasmine@cumin2002"
* 17:04 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5018
* 17:04 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2008 - jasmine@cumin2002"
* 17:03 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5018.eqsin.wmnet with OS trixie
* 16:58 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:58 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:57 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 16:57 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:57 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:50 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf1002.eqiad.wmnet with OS trixie
* 16:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:47 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1001.eqiad.wmnet with OS trixie
* 16:47 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 16:47 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/redioscope: apply
* 16:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:41 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 16:41 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 16:35 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2008
* 16:34 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS trixie
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:31 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
* 16:30 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
* 16:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
* 16:29 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:26 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
* 16:23 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
* 16:22 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: apply
* 16:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:16 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:12 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf1001.eqiad.wmnet with OS trixie
* 16:10 jiji@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'sync'.
* 16:09 jiji@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'sync'.
* 16:07 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf2002.codfw.wmnet with OS trixie
* 16:02 jiji@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 16:02 jiji@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 16:00 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'.
* 15:59 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
* 15:59 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'.
* 15:59 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 15:59 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'.
* 15:59 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/termbox: apply
* 15:58 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/termbox: apply
* 15:58 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/termbox: apply
* 15:57 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 15:57 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 15:57 lucaswerkmeister-wmde@deploy1003: helmfile [staging] DONE helmfile.d/services/termbox: apply
* 15:56 lucaswerkmeister-wmde@deploy1003: helmfile [staging] START helmfile.d/services/termbox: apply
* 15:54 jiji@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:53 jiji@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 15:51 jiji@deploy1003: Finished scap sync-world: redeploy {{Gerrit|1299468}} (duration: 07m 23s)
* 15:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf2002.codfw.wmnet with reason: host reimage
* 15:47 jiji@deploy1003: jiji: Continuing with deployment
* 15:46 jiji@deploy1003: jiji: redeploy {{Gerrit|1299468}} synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:46 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf2002.codfw.wmnet with reason: host reimage
* 15:45 jiji@deploy1003: Started scap sync-world: redeploy {{Gerrit|1299468}}
* 15:43 brouberol@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-eqiad
* 15:34 brennen@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab1004 for [[phab:T410849|T410849]] (followup for robots.txt) (duration: 00m 40s)
* 15:33 brennen@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab1004 for [[phab:T410849|T410849]] (followup for robots.txt)
* 15:33 brennen@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab2002 for [[phab:T410849|T410849]] (followup for robots.txt) (duration: 00m 45s)
* 15:32 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299468{{!}}ProductionServices.php: switch filebackend.php to rdb2015:6381 #2 (T418918 T291916)]] (duration: 07m 21s)
* 15:32 brennen@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab2002 for [[phab:T410849|T410849]] (followup for robots.txt)
* 15:28 jiji@deploy1003: Rolling back deployment
* 15:27 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf2002.codfw.wmnet with OS trixie
* 15:27 jiji@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 15:26 jiji@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 15:25 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1299468{{!}}ProductionServices.php: switch filebackend.php to rdb2015:6381 #2 (T418918 T291916)]]
* 15:22 urbanecm: Remove `migrateMentorStatusAwayToCommunityConfiguration` from updatelog on all wikis ([[phab:T409170|T409170]]; the script was only ever run as a dry-run)
* 15:21 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 15:21 jiji@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 15:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf2001.codfw.wmnet with OS trixie
* 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@d244a3e]: deploy phab1004 for [[phab:T410849|T410849]] (duration: 00m 42s)
* 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@d244a3e]: deploy phab1004 for [[phab:T410849|T410849]]
* 15:02 brennen@deploy1003: Finished deploy [phabricator/deployment@d244a3e]: deploy phab2002 for [[phab:T410849|T410849]] (duration: 00m 45s)
* 15:01 brennen@deploy1003: Started deploy [phabricator/deployment@d244a3e]: deploy phab2002 for [[phab:T410849|T410849]]
* 14:58 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf2001.codfw.wmnet with reason: host reimage
* 14:52 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf2001.codfw.wmnet with reason: host reimage
* 14:52 arnaudb@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab[2002-2003].codfw.wmnet,phab[1004-1006].eqiad.wmnet with reason: [[phab:T410849|T410849]]
* 14:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 14:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 14:40 moritzm: upgrade routinator in codfw to 0.15.2 [[phab:T428456|T428456]]
* 14:35 brouberol@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 14:33 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf2001.codfw.wmnet with OS trixie
* 14:26 brouberol@cumin1003: END (ERROR) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=97) rolling reboot on A:cephosd-eqiad
* 14:26 brouberol@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 14:20 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-codfw
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host parsoidtest1001.eqiad.wmnet
* 14:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2153: Migration of db2153.codfw.wmnet completed
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of rpki2003.codfw.wmnet to drbd
* 14:14 moritzm: imported routinator 0.15.2-1bookworm to thirdparty/routinator for bookworm-wikimedia [[phab:T428456|T428456]]
* 14:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1184: Migration of db1184.eqiad.wmnet completed
* 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host parsoidtest1001.eqiad.wmnet
* 14:07 Dreamy_Jazz: Afternoon UTC backport window done
* 14:07 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:06 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299495{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]], [[gerrit:1299502{{!}}SecurePollLogPager: Cast user IDs to ints before use (T428599)]] (duration: 06m 53s)
* 14:06 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: rack depool
* 14:03 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of rpki2003.codfw.wmnet to drbd
* 14:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow2004.codfw.wmnet to drbd
* 14:02 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:02 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1299495{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]], [[gerrit:1299502{{!}}SecurePollLogPager: Cast user IDs to ints before use (T428599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:59 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1299495{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]], [[gerrit:1299502{{!}}SecurePollLogPager: Cast user IDs to ints before use (T428599)]]
* 13:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:56 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:56 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:56 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 13:56 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 13:55 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:55 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* {{safesubst:SAL entry|1=13:55 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298929{{!}}Simplify fragment processing (T423700)]], [[gerrit:1298926{{!}}Move ::getFragmentsToTransform() to Content<nowiki>{</nowiki>Text,DOM<nowiki>}</nowiki>TransformStage]], [[gerrit:1298927{{!}}OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages]], [[gerrit:1298925{{!}}Reset DeduplicateStyles state between different pipeline executions (T428336 T428215)]], [[gerrit:1299497}}
* 13:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:51 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow2004.codfw.wmnet to drbd
* 13:50 cscott@deploy1003: cscott: Continuing with deployment
* 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2045.codfw.wmnet to cluster codfw and group A
* 13:48 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2045.codfw.wmnet to cluster codfw and group A
* 13:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2027.codfw.wmnet to cluster codfw and group A
* 13:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2027.codfw.wmnet to cluster codfw and group A
* 13:46 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 13:45 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 13:44 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* {{safesubst:SAL entry|1=13:42 cscott@deploy1003: cscott: Backport for [[gerrit:1298929{{!}}Simplify fragment processing (T423700)]], [[gerrit:1298926{{!}}Move ::getFragmentsToTransform() to Content<nowiki>{</nowiki>Text,DOM<nowiki>}</nowiki>TransformStage]], [[gerrit:1298927{{!}}OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages]], [[gerrit:1298925{{!}}Reset DeduplicateStyles state between different pipeline executions (T428336 T428215)]], [[gerrit:1299497{{!}}Store indicators}}
* 13:41 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* {{safesubst:SAL entry|1=13:40 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1298929{{!}}Simplify fragment processing (T423700)]], [[gerrit:1298926{{!}}Move ::getFragmentsToTransform() to Content<nowiki>{</nowiki>Text,DOM<nowiki>}</nowiki>TransformStage]], [[gerrit:1298927{{!}}OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages]], [[gerrit:1298925{{!}}Reset DeduplicateStyles state between different pipeline executions (T428336 T428215)]], [[gerrit:1299497{{!}}}}
* 13:40 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw
* 13:39 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 13:37 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 13:35 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 13:33 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 13:32 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 13:32 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298834{{!}}config: Disable EmailConfirmationBanner on all wikis (T428291)]] (duration: 07m 01s)
* 13:30 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2153: Migration of db2153.codfw.wmnet completed
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 lucaswerkmeister-wmde@deploy1003: mmartorana, lucaswerkmeister-wmde: Continuing with deployment
* 13:27 lucaswerkmeister-wmde@deploy1003: mmartorana, lucaswerkmeister-wmde: Backport for [[gerrit:1298834{{!}}config: Disable EmailConfirmationBanner on all wikis (T428291)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:26 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1184: Migration of db1184.eqiad.wmnet completed
* 13:25 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298834{{!}}config: Disable EmailConfirmationBanner on all wikis (T428291)]]
* 13:25 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:24 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:21 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 13:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2153.codfw.wmnet with OS trixie
* 13:20 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2241: rack depool
* 13:20 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1237: repool after maintenance db1237
* 13:19 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298654{{!}}Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206)]] (duration: 09m 40s)
* 13:17 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2006.codfw.wmnet
* 13:17 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2006.codfw.wmnet
* 13:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2251-2253].codfw.wmnet
* 13:16 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2251-2253].codfw.wmnet
* 13:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2005.codfw.wmnet
* 13:16 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2005.codfw.wmnet
* 13:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1184.eqiad.wmnet with OS trixie
* 13:14 lucaswerkmeister-wmde@deploy1003: neriah, lucaswerkmeister-wmde: Continuing with deployment
* 13:11 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
* 13:11 lucaswerkmeister-wmde@deploy1003: neriah, lucaswerkmeister-wmde: Backport for [[gerrit:1298654{{!}}Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:09 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298654{{!}}Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206)]]
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2153.codfw.wmnet with reason: host reimage
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1015.eqiad.wmnet with OS trixie
* 12:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1184.eqiad.wmnet with reason: host reimage
* 12:58 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2153.codfw.wmnet with reason: host reimage
* 12:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1016.eqiad.wmnet with OS trixie
* 12:57 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
* 12:56 XioNoX: lsw1-a4-codfw> request system reboot - [[phab:T427357|T427357]]
* 12:55 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1184.eqiad.wmnet with reason: host reimage
* 12:50 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299477{{!}}hCaptcha: Roll out to all wikis for api account creation. (T426050)]] (duration: 07m 21s)
* 12:46 kharlan@deploy1003: kharlan, dbrant: Continuing with deployment
* 12:46 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
* 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 12:45 kharlan@deploy1003: kharlan, dbrant: Backport for [[gerrit:1299477{{!}}hCaptcha: Roll out to all wikis for api account creation. (T426050)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:45 topranks: shut sub-interfaces for row A/B legacy vlans on cr1-codfw [[phab:T427357|T427357]]
* 12:45 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
* 12:43 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1299477{{!}}hCaptcha: Roll out to all wikis for api account creation. (T426050)]]
* 12:42 topranks: increase OSPF cost on ssw1-a1-codfw link to lsw1-a4-codfw to force traffic via alternate spine [[phab:T427357|T427357]]
* 12:41 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299478{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]] (duration: 07m 02s)
* 12:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 12:40 moritzm: installing wireshark security updates
* 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2153.codfw.wmnet with OS trixie
* 12:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1184.eqiad.wmnet with OS trixie
* 12:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:36 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1299478{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2153: Upgrading db2153.codfw.wmnet
* 12:34 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1237: repool after maintenance db1237
* 12:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1299478{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]]
* 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2153: Upgrading db2153.codfw.wmnet
* 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1184: Upgrading db1184.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1184: Upgrading db1184.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1237.eqiad.wmnet with OS trixie
* 12:32 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 12:32 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 12:29 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:29 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:27 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2005.codfw.wmnet
* 12:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2046: repool after maintenance
* 12:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2006.codfw.wmnet
* 12:23 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298829{{!}}wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126)]] (duration: 16m 04s)
* 12:23 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2006.codfw.wmnet
* 12:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2251-2253].codfw.wmnet
* 12:22 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2005.codfw.wmnet
* 12:20 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2251-2253].codfw.wmnet
* 12:20 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:20 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: rack depool
* 12:20 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:20 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2241: rack depool
* 12:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1016
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1016
* 12:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1015
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1015
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1016.eqiad.wmnet with OS trixie
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1015.eqiad.wmnet with OS trixie
* 12:17 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
* 12:17 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 24 hosts with reason: Rack A4 depool
* 12:16 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Continuing with deployment
* 12:15 topranks: drain traffic on ssw1-a1-codfw - add gshut community in evpn underlay - [[phab:T427357|T427357]]
* 12:14 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
* 12:13 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Backport for [[gerrit:1298829{{!}}wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1237.eqiad.wmnet with reason: host reimage
* 12:07 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1298829{{!}}wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126)]]
* 12:05 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1237.eqiad.wmnet with reason: host reimage
* 12:00 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Dmaza out of all services on: 2435 hosts
* 11:51 atsuko@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 11:51 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
* 11:49 atsuko@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 11:48 atsuko@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 11:47 atsuko@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 11:45 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:44 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:43 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:43 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2046: repool after maintenance
* 11:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:36 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2046.codfw.wmnet with OS trixie
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2185.codfw.wmnet with reason: Reimage
* 11:31 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging HMonroy out of all services on: 2435 hosts
* 11:28 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging KSiebert out of all services on: 2435 hosts
* 11:26 slyngs: CAS-SSO upgrade to version 7.3.7.2
* 11:26 slyngshede@dns1004: END - running authdns-update
* 11:24 slyngshede@dns1004: START - running authdns-update
* 11:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2046.codfw.wmnet with reason: host reimage
* 11:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1043: repool after upgrade
* 11:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2046.codfw.wmnet with reason: host reimage
* 10:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2046.codfw.wmnet with OS trixie
* 10:53 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2046: Upgrading es2046.codfw.wmnet
* 10:53 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2046: Upgrading es2046.codfw.wmnet
* 10:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 10:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 10:52 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 10:52 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:52 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 10:51 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:32 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1043: repool after upgrade
* 10:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:28 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1160: Repooling
* 10:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1043.eqiad.wmnet with OS trixie
* 10:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:17 elukey: complete rollout of apache2 upgrades
* 10:16 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:15 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:13 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:12 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:12 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1043.eqiad.wmnet with reason: host reimage
* 10:04 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1043.eqiad.wmnet with reason: host reimage
* 10:04 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:04 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:04 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:57 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
* 09:51 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:51 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:50 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:50 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS trixie
* 09:48 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1043: Upgrading es1043.eqiad.wmnet
* 09:48 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:47 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:45 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 09:41 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 09:36 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=5 --verbose --last-checked="20260603"` (after stopping previous scan run)
* 09:34 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=5 --verbose` (after stopping previous scan run)
* 09:27 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 09:26 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 09:17 fceratto@cumin1003: MariaDB change: Setting sections s5 as read-write
* 09:17 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1043: Upgrading es1043.eqiad.wmnet
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:12 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1042 to es4 eqiad primary [[phab:T428386|T428386]]', diff saved to https://phabricator.wikimedia.org/P93943 and previous config saved to /var/cache/conftool/dbconfig/20260609-091215-marostegui.json
* 09:11 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1043 to es4 eqiad primary [[phab:T428386|T428386]]', diff saved to https://phabricator.wikimedia.org/P93942 and previous config saved to /var/cache/conftool/dbconfig/20260609-091147-marostegui.json
* 09:03 jiji@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=registry2005.codfw.wmnet
* 08:59 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:59 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:57 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1237.eqiad.wmnet with OS trixie
* 08:55 jiji@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=registry2005.codfw.wmnet
* 08:55 jiji@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=registry2004.codfw.wmnet
* 08:50 jiji@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=registry2004.codfw.wmnet
* 08:22 jiji@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=docker-registry,name=codfw
* 08:22 jiji@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=docker-registry,name=eqiad
* 08:08 jiji@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=docker-registry,name=eqiad
* 08:08 jiji@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=docker-registry,name=codfw
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix typoes - ayounsi@cumin1003"
* 07:59 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix typoes - ayounsi@cumin1003"
* 07:52 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:47 brouberol@dns1004: END - running authdns-update
* 07:46 brouberol@dns1004: START - running authdns-update
* 07:44 brouberol@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:43 brouberol@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:43 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:42 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:41 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:39 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:38 brouberol@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 07:37 brouberol@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 07:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
* 07:36 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.major-upgrade (exit_code=97)
* 07:36 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 07:36 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:26 fceratto@dns1004: END - running authdns-update
* 07:24 fceratto@dns1004: START - running authdns-update
* 07:22 marostegui@dns1004: END - running authdns-update
* 07:21 marostegui@dns1004: START - running authdns-update
* 07:19 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:19 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fix dse-k8s-wdqs2002 duplicate ipv6 address - elukey@cumin1003"
* 07:19 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fix dse-k8s-wdqs2002 duplicate ipv6 address - elukey@cumin1003"
* 07:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 07:12 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 07:11 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1160: Repooling
* 07:11 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
* 07:11 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1160: Repooling
* 07:11 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
* 07:00 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:00 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1237.eqiad.wmnet with OS trixie
* 06:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1160 [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93940 and previous config saved to /var/cache/conftool/dbconfig/20260609-062412-fceratto.json
* 06:17 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:16 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:16 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:16 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:15 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:15 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:15 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:14 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:12 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1244 to s4 primary and set section read-write [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93939 and previous config saved to /var/cache/conftool/dbconfig/20260609-061222-fceratto.json
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Set s4 eqiad as read-only for maintenance - [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93938 and previous config saved to /var/cache/conftool/dbconfig/20260609-061131-fceratto.json
* 06:10 federico3: Starting s4 eqiad failover from db1160 to db1244 - [[phab:T426086|T426086]]
* 06:01 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1244 with weight 0 [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93937 and previous config saved to /var/cache/conftool/dbconfig/20260609-060121-fceratto.json
* 06:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 40 hosts with reason: Primary switchover s4 [[phab:T426086|T426086]]
* 05:40 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
* 05:37 marostegui@dns1004: START - running authdns-update
* 05:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1237: Upgrading db1237.eqiad.wmnet
* 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1237: Upgrading db1237.eqiad.wmnet
* 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1237 [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93935 and previous config saved to /var/cache/conftool/dbconfig/20260609-052420-marostegui.json
* 05:23 marostegui@dns1004: START - running authdns-update
* 05:23 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1220 to x1 primary and set section read-write [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93934 and previous config saved to /var/cache/conftool/dbconfig/20260609-052311-marostegui.json
* 05:22 marostegui@cumin1003: dbctl commit (dc=all): 'Set x1 eqiad as read-only for maintenance - [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93933 and previous config saved to /var/cache/conftool/dbconfig/20260609-052253-marostegui.json
* 05:22 marostegui: Starting x1 eqiad failover from db1237 to db1220 - [[phab:T428158|T428158]]
* 05:19 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1220 with weight 0 [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93932 and previous config saved to /var/cache/conftool/dbconfig/20260609-051859-marostegui.json
* 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 16 hosts with reason: Primary switchover x1 [[phab:T428158|T428158]]
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.3 (duration: 02m 43s)
* 03:40 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.6 refs [[phab:T423915|T423915]] (duration: 37m 16s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-08 ==
* 22:00 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298915{{!}}CommonSettings: Set $wgScoreSafeMode = false (T428484)]] (duration: 07m 42s)
* 21:56 reedy@deploy1003: reedy: Continuing with deployment
* 21:54 reedy@deploy1003: reedy: Backport for [[gerrit:1298915{{!}}CommonSettings: Set $wgScoreSafeMode = false (T428484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:53 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1298915{{!}}CommonSettings: Set $wgScoreSafeMode = false (T428484)]]
* 21:12 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298891{{!}}OOUIHTMLForm: Avoid treating form header as a clickable label (T428359)]] (duration: 08m 10s)
* 21:07 mlitn@deploy1003: mlitn, neriah: Continuing with deployment
* 21:05 mlitn@deploy1003: mlitn, neriah: Backport for [[gerrit:1298891{{!}}OOUIHTMLForm: Avoid treating form header as a clickable label (T428359)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:03 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1298891{{!}}OOUIHTMLForm: Avoid treating form header as a clickable label (T428359)]]
* 20:43 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297162{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias]], [[gerrit:1298841{{!}}Squashed diff to master]] (duration: 07m 05s)
* 20:39 mlitn@deploy1003: mlitn: Continuing with deployment
* 20:38 mlitn@deploy1003: mlitn: Backport for [[gerrit:1297162{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias]], [[gerrit:1298841{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:36 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1297162{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias]], [[gerrit:1298841{{!}}Squashed diff to master]]
* 20:29 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298390{{!}}English Wikibooks: update FlaggedRevs configuration (T428329)]], [[gerrit:1298328{{!}}English Wikiversity: Add new user group "autopatrolled" (T428269)]] (duration: 08m 58s)
* 20:25 mlitn@deploy1003: mlitn, vadymts1: Continuing with deployment
* 20:22 mlitn@deploy1003: mlitn, vadymts1: Backport for [[gerrit:1298390{{!}}English Wikibooks: update FlaggedRevs configuration (T428329)]], [[gerrit:1298328{{!}}English Wikiversity: Add new user group "autopatrolled" (T428269)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:20 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1298390{{!}}English Wikibooks: update FlaggedRevs configuration (T428329)]], [[gerrit:1298328{{!}}English Wikiversity: Add new user group "autopatrolled" (T428269)]]
* 20:03 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298879{{!}}SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437)]] (duration: 37m 43s)
* 19:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:31 kharlan@deploy1003: kharlan: Continuing with deployment
* 19:30 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:29 kharlan@deploy1003: kharlan: Backport for [[gerrit:1298879{{!}}SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:28 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:27 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:25 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1298879{{!}}SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437)]]
* 19:24 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab (duration: 01m 32s)
* 19:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:22 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab
* 19:20 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab (duration: 01m 40s)
* 19:19 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab
* 19:16 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:14 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2004
* 18:52 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2004
* 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2003
* 18:52 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2003
* 18:51 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:51 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2004 to codfw - jhancock@cumin2002"
* 18:51 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2004 to codfw - jhancock@cumin2002"
* 18:44 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:42 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:42 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2030 to codfw - jhancock@cumin2002"
* 18:42 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2030 to codfw - jhancock@cumin2002"
* 18:37 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:33 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2002
* 18:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2002
* 18:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2002 to codfw - jhancock@cumin2002"
* 18:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2002 to codfw - jhancock@cumin2002"
* 18:25 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:22 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2001
* 18:22 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2001
* 18:21 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating dse-k8s-wdqs2001 to codfw - jhancock@cumin2002"
* 18:21 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating dse-k8s-wdqs2001 to codfw - jhancock@cumin2002"
* 18:17 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:02 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T427286|T427286]] (duration: 00m 12s)
* 18:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T427286|T427286]]
* 17:37 jnuche@deploy1003: Installation of scap version "4.268.0" completed for 2 hosts
* 17:35 jnuche@deploy1003: Installing scap version "4.268.0" for 2 host(s)
* 17:21 claime: restarting varnish-frontend service on cp6012
* 17:21 claime: restarting varnish-frontend service on cp6011
* 17:21 claime: restarted varnish-frontend service on cp6009
* 17:13 taavi: bounce sirenbot to get it to re-join a channel
* 17:05 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 17:05 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:58 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 16:57 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 16:55 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 16:53 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 16:53 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 16:52 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 16:30 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 16:29 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 16:29 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 16:27 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 16:27 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 16:26 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 16:26 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 16:25 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 16:18 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 16:17 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 16:17 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 16:16 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 16:16 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:16 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:16 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 16:15 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 16:14 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 16:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 16:14 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 16:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 16:13 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 16:13 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 16:13 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 16:12 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 16:12 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 16:09 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 16:08 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 16:08 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 16:07 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 16:06 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 15:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2042: repool after upgrade
* 15:45 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db[2183-2184].codfw.wmnet
* 15:45 jynus@cumin2002: START - Cookbook sre.hosts.remove-downtime for db[2183-2184].codfw.wmnet
* 15:18 jynus: dbmaint on backup1-codfw@codfw ([[phab:T428467|T428467]])
* 15:12 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2042: repool after upgrade
* 15:12 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 15:09 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:09 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2042.codfw.wmnet with OS trixie
* 15:04 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:04 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:03 jynus@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2183-2184].codfw.wmnet with reason: Switchover db
* 15:03 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:01 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 15:00 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 14:59 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:55 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:55 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:54 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:50 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:50 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:50 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:49 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2042.codfw.wmnet with reason: host reimage
* 14:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2042.codfw.wmnet with reason: host reimage
* 14:32 Lucas_WMDE: UTC afternoon backport+config window done
* 14:32 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298709{{!}}Add translatable messages for WikiProject names (T427804)]], [[gerrit:1298710{{!}}Use translatable messages for WikiProject links (T427804)]], [[gerrit:1297644{{!}}WikiProject links - remove 'text' config (T427804)]] (duration: 31m 57s)
* 14:27 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:26 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2042.codfw.wmnet with OS trixie
* 14:26 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2042: Upgrading es2042.codfw.wmnet
* 14:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2042: Upgrading es2042.codfw.wmnet
* 14:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:24 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2043 to es4 codfw primary [[phab:T428386|T428386]]', diff saved to https://phabricator.wikimedia.org/P93926 and previous config saved to /var/cache/conftool/dbconfig/20260608-142423-marostegui.json
* 14:23 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1041: repool after maintenance
* 14:19 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Continuing with deployment
* 14:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Backport for [[gerrit:1298709{{!}}Add translatable messages for WikiProject names (T427804)]], [[gerrit:1298710{{!}}Use translatable messages for WikiProject links (T427804)]], [[gerrit:1297644{{!}}WikiProject links - remove 'text' config (T427804)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:11 cgoubert@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=liftwing-openapi-server.*
* 14:10 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp6013.*
* 14:10 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:05 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 14:05 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 13:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 13:52 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:50 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296550{{!}}hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608)]] (duration: 08m 31s)
* 13:48 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:46 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 13:43 cgoubert@dns1004: END - running authdns-update
* 13:43 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296550{{!}}hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:41 cgoubert@dns1004: START - running authdns-update
* 13:41 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296550{{!}}hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608)]]
* 13:39 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 13:38 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298758{{!}}feat(V2): toggle experiment features based on custom url override (T424646)]], [[gerrit:1298762{{!}}specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646)]], [[gerrit:1298764{{!}}fix: correctly read experiments param on Special:UserLogin]], [[gerrit:1298765{{!}}signup.js: use JS var instead of TestKitchen to show exp
* 13:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1041: repool after maintenance
* 13:38 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 13:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:37 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 13:36 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 13:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1041.eqiad.wmnet with OS trixie
* 13:34 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 13:34 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2041: repool after upgrade
* 13:34 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde: Continuing with deployment
* 13:34 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 13:32 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 13:30 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde: Backport for [[gerrit:1298758{{!}}feat(V2): toggle experiment features based on custom url override (T424646)]], [[gerrit:1298762{{!}}specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646)]], [[gerrit:1298764{{!}}fix: correctly read experiments param on Special:UserLogin]], [[gerrit:1298765{{!}}signup.js: use JS var instead of TestKitchen to show
* 13:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298758{{!}}feat(V2): toggle experiment features based on custom url override (T424646)]], [[gerrit:1298762{{!}}specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646)]], [[gerrit:1298764{{!}}fix: correctly read experiments param on Special:UserLogin]], [[gerrit:1298765{{!}}signup.js: use JS var instead of TestKitchen to show expe
* 13:21 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298418{{!}}NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206)]], [[gerrit:1298717{{!}}Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206)]], [[gerrit:1298734{{!}}Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206)]] (duration: 11m 06s)
* 13:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1041.eqiad.wmnet with reason: host reimage
* 13:17 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
* 13:12 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 13:12 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1298418{{!}}NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206)]], [[gerrit:1298717{{!}}Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206)]], [[gerrit:1298734{{!}}Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki
* 13:12 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 13:12 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1041.eqiad.wmnet with reason: host reimage
* 13:11 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 13:11 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 13:10 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298418{{!}}NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206)]], [[gerrit:1298717{{!}}Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206)]], [[gerrit:1298734{{!}}Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206)]]
* 12:57 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298767{{!}}Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608)]] (duration: 06m 20s)
* 12:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1041.eqiad.wmnet with OS trixie
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1041: Upgrading es1041.eqiad.wmnet
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:55 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1041: Upgrading es1041.eqiad.wmnet
* 12:55 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:54 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:53 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:53 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1298767{{!}}Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:51 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1298767{{!}}Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608)]]
* 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2041: repool after upgrade
* 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:46 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2063.codfw.wmnet with OS bullseye
* 12:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2062.codfw.wmnet with OS bullseye
* 12:27 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2041.codfw.wmnet with OS trixie
* 12:21 joal@deploy1003: Finished deploy [analytics/refinery@d67c584] (thin): Regular analytics weekly train THIN [analytics/refinery@d67c584f] (duration: 02m 00s)
* 12:19 joal@deploy1003: Started deploy [analytics/refinery@d67c584] (thin): Regular analytics weekly train THIN [analytics/refinery@d67c584f]
* 12:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
* 12:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 12:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 12:16 joal@deploy1003: Finished deploy [analytics/refinery@d67c584]: Regular analytics weekly train [analytics/refinery@d67c584f] (duration: 07m 52s)
* 12:15 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
* 12:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
* 12:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2041.codfw.wmnet with reason: host reimage
* 12:08 joal@deploy1003: Started deploy [analytics/refinery@d67c584]: Regular analytics weekly train [analytics/refinery@d67c584f]
* 12:08 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
* 12:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add eqiad e8 public vlans - ayounsi@cumin1003"
* 12:06 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add eqiad e8 public vlans - ayounsi@cumin1003"
* 12:03 joal@deploy1003: Finished deploy [analytics/refinery@d67c584] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d67c584f] (duration: 02m 00s)
* 12:03 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2041.codfw.wmnet with reason: host reimage
* 12:01 joal@deploy1003: Started deploy [analytics/refinery@d67c584] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d67c584f]
* 12:01 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:00 ayounsi@cumin1003: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 12:00 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:00 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 12:00 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2063
* 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2063
* 11:57 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2063
* 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2063.codfw.wmnet 52.16.192.10.in-addr.arpa 2.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:56 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2063.codfw.wmnet 52.16.192.10.in-addr.arpa 2.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:56 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:56 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2063 - mvernon@cumin2002"
* 11:56 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2063 - mvernon@cumin2002"
* 11:51 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:51 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2063
* 11:50 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2063.codfw.wmnet with OS bullseye
* 11:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2062
* 11:50 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2062
* 11:49 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2062
* 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2062.codfw.wmnet 123.0.192.10.in-addr.arpa 3.2.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:49 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2062.codfw.wmnet 123.0.192.10.in-addr.arpa 3.2.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2062 - mvernon@cumin2002"
* 11:49 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2062 - mvernon@cumin2002"
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2041.codfw.wmnet with OS trixie
* 11:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2041: Upgrading es2041.codfw.wmnet
* 11:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2041: Upgrading es2041.codfw.wmnet
* 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:44 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.major-upgrade (exit_code=97)
* 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: repool after maintenance
* 11:43 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:43 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2062
* 11:42 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2062.codfw.wmnet with OS bullseye
* 11:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298728{{!}}SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032)]] (duration: 17m 39s)
* 11:25 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:18 Raine: progressively switching shellbox to bookworm (start)
* 11:15 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 11:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 11:14 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1298728{{!}}SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:13 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 11:12 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 11:12 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1298728{{!}}SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032)]]
* 11:02 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2062
* 11:02 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2063
* 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1042: repool after maintenance
* 10:58 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:56 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1042.eqiad.wmnet with OS trixie
* 10:47 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298721{{!}}GuessedThumbnailInfo: Also allow showing webp originals (T428202)]] (duration: 16m 41s)
* 10:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1042.eqiad.wmnet with reason: host reimage
* 10:39 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 10:39 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 10:38 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 10:36 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 10:36 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 10:35 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2043: repool after upgrade
* 10:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2160.codfw.wmnet with reason: Reboot
* 10:34 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1298721{{!}}GuessedThumbnailInfo: Also allow showing webp originals (T428202)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:34 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1042.eqiad.wmnet with reason: host reimage
* 10:30 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1298721{{!}}GuessedThumbnailInfo: Also allow showing webp originals (T428202)]]
* 10:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1042.eqiad.wmnet with OS trixie
* 10:18 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:18 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:18 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:18 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1042: Upgrading es1042.eqiad.wmnet
* 10:14 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:14 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:14 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:14 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1042: Upgrading es1042.eqiad.wmnet
* 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:12 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2063
* 10:09 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2062
* 10:07 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:07 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:07 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:06 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:52 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 09:52 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 09:50 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 09:49 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 09:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2043: repool after upgrade
* 09:49 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2043.codfw.wmnet with OS trixie
* 09:44 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 09:44 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 09:42 ozge@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: sync
* 09:42 ozge@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: sync
* 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2043.codfw.wmnet with reason: host reimage
* 09:27 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 09:23 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2043.codfw.wmnet with reason: host reimage
* 09:17 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 09:15 ozge@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: sync
* 09:15 ozge@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: sync
* 09:07 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2043.codfw.wmnet with OS trixie
* 09:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2043: Upgrading es2043.codfw.wmnet
* 09:06 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2043: Upgrading es2043.codfw.wmnet
* 09:05 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1217.eqiad.wmnet with OS trixie
* 08:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1217.eqiad.wmnet with reason: host reimage
* 08:15 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database urwikisource ([[phab:T415977|T415977]])
* 08:14 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database urwikisource ([[phab:T415977|T415977]])
* 08:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1217.eqiad.wmnet with reason: host reimage
* 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2052: repool after upgrade
* 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1051: repool after maintenance
* 08:03 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis urwikisource in section s5
* 07:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1217.eqiad.wmnet with OS trixie
* 07:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1217.eqiad.wmnet with reason: reimage
* 07:53 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
* 07:52 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis urwikisource in section s5
* 07:50 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis urwikisource in section s5
* 07:50 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.sanitize-wiki (exit_code=97) Managing sanitization for wikis urwikisource in section s5
* 07:50 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
* 07:44 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297681{{!}}Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662)]] (duration: 32m 51s)
* 07:32 wmde-fisch@deploy1003: wmde-fisch, lilients: Continuing with deployment
* 07:29 wmde-fisch@deploy1003: wmde-fisch, lilients: Backport for [[gerrit:1297681{{!}}Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:21 elukey: upgrade sudo package on an-* hosts for [[phab:T428384|T428384]]
* 07:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2052: repool after upgrade
* 07:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1051: repool after maintenance
* 07:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:12 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database urwikisource ([[phab:T415977|T415977]])
* 07:12 elukey: upgrade exim4 packages on seaborgium for security upgrades
* 07:11 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1297681{{!}}Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662)]]
* 06:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1051.eqiad.wmnet with OS trixie
* 06:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1051.eqiad.wmnet with reason: host reimage
* 06:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1051.eqiad.wmnet with reason: host reimage
* 06:15 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database urwikisource ([[phab:T415977|T415977]])
* 05:58 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1051.eqiad.wmnet with OS trixie
* 05:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2052.codfw.wmnet with OS trixie
* 05:44 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1051: Upgrading es1051.eqiad.wmnet
* 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2052.codfw.wmnet with reason: host reimage
* 05:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2052.codfw.wmnet with reason: host reimage
* 05:35 marostegui@dns1004: END - running authdns-update
* 05:34 marostegui@dns1004: START - running authdns-update
* 05:33 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1051: Upgrading es1051.eqiad.wmnet
* 05:33 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:31 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1054 to es3 eqiad primary [[phab:T428050|T428050]]', diff saved to https://phabricator.wikimedia.org/P93895 and previous config saved to /var/cache/conftool/dbconfig/20260608-053156-marostegui.json
* 05:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2052.codfw.wmnet with OS trixie
* 05:18 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2052: Upgrading es2052.codfw.wmnet
* 05:18 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2052: Upgrading es2052.codfw.wmnet
* 05:18 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
== 2026-06-07 ==
* 16:32 elukey: `elukey@cumin1003:~$ sudo cumin 'cp6* and not cp6014* and not cp6010*' "varnish-frontend-restart" -b 1`
* 16:29 elukey: restart varnish-frontend on cp6014
== 2026-06-06 ==
* 09:07 ammarpad@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=hewiki --logwiki=metawiki W.Mechelke Tungsten_Mechelke # [[phab:T428182|T428182]]
== 2026-06-05 ==
* 22:16 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 21:01 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=10 --verbose` (after stopping the other commons scan)
* 20:56 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=30 --verbose` (after stopping the other commons scan)
* 20:20 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290093{{!}}Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)]] (duration: 10m 02s)
* 20:16 krinkle@deploy1003: krinkle: Continuing with deployment
* 20:12 krinkle@deploy1003: krinkle: Backport for [[gerrit:1290093{{!}}Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1290093{{!}}Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)]]
* 16:45 jgreen@dns1004: END - running authdns-update
* 16:44 jgreen@dns1004: START - running authdns-update
* 16:17 dzahn@dns1005: END - running authdns-update
* 16:17 mutante: DNS - adding new project language "mag" - Magahi - a language spoken in India and Nepal by about 12 million native speakers ([[phab:T428266|T428266]])
* 16:16 dzahn@dns1005: START - running authdns-update
* 14:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 12:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:30 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:30 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2202.codfw.wmnet with reason: Reboot
* 12:28 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:28 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:08 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:07 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:07 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:06 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:29 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:28 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:55 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:54 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:31 ozge@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1054: repool after upgrade
* 08:08 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 08:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 08:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 08:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:39 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1054: repool after upgrade
* 07:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:17 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:17 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:16 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:07 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 06:01 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1054.eqiad.wmnet with OS trixie
* 05:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1054.eqiad.wmnet with reason: host reimage
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1054.eqiad.wmnet with reason: host reimage
* 05:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1054.eqiad.wmnet with OS trixie
* 05:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1054: Upgrading es1054.eqiad.wmnet
* 05:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1054: Upgrading es1054.eqiad.wmnet
* 05:20 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 01:55 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1010.eqiad.wmnet with OS trixie
* 01:39 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
* 01:32 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
* 01:16 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS trixie
* 00:56 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1007.eqiad.wmnet with OS trixie
* 00:40 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
* 00:33 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
* 00:17 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1007.eqiad.wmnet with OS trixie
* 00:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297268{{!}}Redirect unknown wikinews languages to portal (T427126)]] (duration: 07m 02s)
== 2026-06-04 ==
* 23:57 ladsgroup@deploy1003: ladsgroup, pppery: Continuing with deployment
* 23:57 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1006.eqiad.wmnet with OS trixie
* 23:57 ladsgroup@deploy1003: ladsgroup, pppery: Backport for [[gerrit:1297268{{!}}Redirect unknown wikinews languages to portal (T427126)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:55 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1297268{{!}}Redirect unknown wikinews languages to portal (T427126)]]
* 23:40 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
* 23:36 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
* 23:20 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS trixie
* 21:28 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host releases1003.eqiad.wmnet with OS trixie
* 21:04 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: host reimage
* 20:58 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on releases1003.eqiad.wmnet with reason: host reimage
* 20:50 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.*
* 20:42 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host releases1003.eqiad.wmnet with OS trixie
* 20:27 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1100.eqiad.wmnet,service=(cdn{{!}}ats-be)
* 20:26 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6013.drmrs.wmnet,service=(cdn{{!}}ats-be)
* 20:20 brett@dns1006: END - running authdns-update
* 20:19 brett@dns1006: START - running authdns-update
* 20:18 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5030.eqsin.wmnet with OS trixie
* 20:10 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296015{{!}}Deploy PRV to 6 wikis (T427851)]] (duration: 07m 39s)
* 20:08 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist group2.dblist extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose`
* 20:06 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:04 arlolra@deploy1003: arlolra: Backport for [[gerrit:1296015{{!}}Deploy PRV to 6 wikis (T427851)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:02 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1296015{{!}}Deploy PRV to 6 wikis (T427851)]]
* 19:49 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:43 cmooney@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:15 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5030
* 19:15 cmooney@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5030
* 19:14 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cp5030
* 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5030.eqsin.wmnet 27.0.132.10.in-addr.arpa 7.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:14 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache cp5030.eqsin.wmnet 27.0.132.10.in-addr.arpa 7.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5030 - cmooney@cumin1003"
* 19:13 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5030 - cmooney@cumin1003"
* 19:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:08 cmooney@cumin1003: START - Cookbook sre.hosts.move-vlan for host cp5030
* 19:08 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host cp5030.eqsin.wmnet with OS trixie
* 18:51 cmooney@dns2005: END - running authdns-update
* 18:50 cmooney@dns2005: START - running authdns-update
* 18:43 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:42 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for eqsin cr links - cmooney@cumin1003"
* 18:40 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for eqsin cr links - cmooney@cumin1003"
* 18:37 sukhe: sukhe@cp6013:~$ sudo traffic_server -C clear_cache
* 18:36 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:08 dancy@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.5 refs [[phab:T423914|T423914]]
* 17:17 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297751{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297752{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] (duration: 06m 40s)
* 17:13 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:13 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297751{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297752{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297751{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297752{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]]
* 16:55 topranks: shift traffic off cr1-esams et-1/0/1 link to asw1-by27-esams [[phab:T427056|T427056]]
* 16:45 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297741{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297742{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] (duration: 13m 58s)
* 16:41 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 16:33 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297741{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297742{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:31 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297741{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297742{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]]
* 16:17 ozge@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:03 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297740{{!}}hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)]] (duration: 10m 21s)
* 16:03 elukey: uploaded spicerack_12.7.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 15:59 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:55 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297740{{!}}hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:53 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297740{{!}}hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)]]
* 15:44 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5030.*
* 15:41 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2007.codfw.wmnet with OS trixie
* 15:39 ladsgroup@cumin1003: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
* 15:28 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 15:24 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297730{{!}}ptwiki: Disable Article Guidance experiment (T426871)]] (duration: 07m 26s)
* 15:24 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
* 15:20 sbisson@deploy1003: sbisson: Continuing with deployment
* 15:19 sbisson@deploy1003: sbisson: Backport for [[gerrit:1297730{{!}}ptwiki: Disable Article Guidance experiment (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:19 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
* 15:17 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1297730{{!}}ptwiki: Disable Article Guidance experiment (T426871)]]
* 15:13 ladsgroup@cumin1003: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
* 15:06 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297724{{!}}Revert "Start reading from new file tables on commons"]] (duration: 07m 00s)
* 15:05 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 15:02 zabe@deploy1003: zabe: Continuing with deployment
* 15:01 zabe@deploy1003: zabe: Backport for [[gerrit:1297724{{!}}Revert "Start reading from new file tables on commons"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1297724{{!}}Revert "Start reading from new file tables on commons"]]
* 14:57 zabe@deploy1003: Finished scap sync-world: [[phab:T416548|T416548]] (duration: 05m 10s)
* 14:56 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-main2007.codfw.wmnet with OS trixie
* 14:52 zabe@deploy1003: Started scap sync-world: [[phab:T416548|T416548]]
* 14:50 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:49 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:43 zabe@deploy1003: sync-world aborted: Backport for [[gerrit:1270513{{!}}Start reading from new file tables on commons (T416548)]] (duration: 03m 58s)
* 14:43 zabe@deploy1003: zabe: Continuing with deployment
* 14:41 zabe@deploy1003: zabe: Backport for [[gerrit:1270513{{!}}Start reading from new file tables on commons (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f1-codfw
* 14:40 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-f1-codfw
* 14:39 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1270513{{!}}Start reading from new file tables on commons (T416548)]]
* 14:36 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297711{{!}}hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)]] (duration: 08m 20s)
* 14:32 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:30 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297711{{!}}hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1057: repool after upgrade
* 14:28 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297711{{!}}hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)]]
* 14:20 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:16 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 14:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 14:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297704{{!}}Use the globalblock-local-status right over globalblock-whitelist (T277942)]], [[gerrit:1296620{{!}}core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)]] (duration: 06m 46s)
* 14:10 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:08 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:08 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297704{{!}}Use the globalblock-local-status right over globalblock-whitelist (T277942)]], [[gerrit:1296620{{!}}core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 14:06 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297704{{!}}Use the globalblock-local-status right over globalblock-whitelist (T277942)]], [[gerrit:1296620{{!}}core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)]]
* 14:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:06 tappof: bump space for prometheus k8s-aux in eqiad
* 14:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 14:05 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:04 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
* 13:56 _joe_: transferred requestctl api tokens for all ops to the db ([[phab:T428119|T428119]])
* 13:56 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2050 to es3 codfw primary [[phab:T428050|T428050]]', diff saved to https://phabricator.wikimedia.org/P93878 and previous config saved to /var/cache/conftool/dbconfig/20260604-135631-marostegui.json
* 13:56 Dreamy_Jazz: Afternoon UTC backport window done
* 13:54 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297700{{!}}Revert "hCaptcha: Provide always challenge sitekey for account creation"]] (duration: 13m 38s)
* 13:51 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:50 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 13:47 sukhe: sukhe@cp6011:~$ sudo -i varnish-frontend-restart
* 13:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1057: repool after upgrade
* 13:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:43 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297700{{!}}Revert "hCaptcha: Provide always challenge sitekey for account creation"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1057.eqiad.wmnet with OS trixie
* 13:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297700{{!}}Revert "hCaptcha: Provide always challenge sitekey for account creation"]]
* 13:38 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297692{{!}}hCaptcha: Provide always challenge sitekey for account creation (T421041)]] (duration: 05m 27s)
* 13:38 dreamyjazz@deploy1003: dreamyjazz: Rolling back deployment
* 13:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: down
* 13:35 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297692{{!}}hCaptcha: Provide always challenge sitekey for account creation (T421041)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297692{{!}}hCaptcha: Provide always challenge sitekey for account creation (T421041)]]
* 13:31 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295978{{!}}Update config for WikiProjects linking prototype (T427804)]] (duration: 17m 13s)
* 13:26 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Continuing with deployment
* 13:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1057.eqiad.wmnet with reason: host reimage
* 13:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1057.eqiad.wmnet with reason: host reimage
* 13:16 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Backport for [[gerrit:1295978{{!}}Update config for WikiProjects linking prototype (T427804)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1295978{{!}}Update config for WikiProjects linking prototype (T427804)]]
* 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1220: Migration of db1220.eqiad.wmnet completed
* 13:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: down
* 13:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1224', diff saved to https://phabricator.wikimedia.org/P93875 and previous config saved to /var/cache/conftool/dbconfig/20260604-131219-marostegui.json
* 13:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1057.eqiad.wmnet with OS trixie
* 13:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1057: Upgrading es1057.eqiad.wmnet
* 12:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1057: Upgrading es1057.eqiad.wmnet
* 12:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:56 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296557{{!}}wmf-config: Skip CAPTCHA for action=mcrundo (T427612)]] (duration: 08m 30s)
* 12:52 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Continuing with deployment
* 12:50 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Backport for [[gerrit:1296557{{!}}wmf-config: Skip CAPTCHA for action=mcrundo (T427612)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2050: repool after upgrade
* 12:48 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296557{{!}}wmf-config: Skip CAPTCHA for action=mcrundo (T427612)]]
* 12:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 12:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 12:28 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1220: Migration of db1220.eqiad.wmnet completed
* 12:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1220.eqiad.wmnet with OS trixie
* 12:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2050: repool after upgrade
* 12:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1220.eqiad.wmnet with reason: host reimage
* 11:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1220.eqiad.wmnet with reason: host reimage
* 11:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1220.eqiad.wmnet with OS trixie
* 11:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2050.codfw.wmnet with OS trixie
* 11:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1220: Upgrading db1220.eqiad.wmnet
* 11:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1220: Upgrading db1220.eqiad.wmnet
* 11:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1179: Migration of db1179.eqiad.wmnet completed
* 11:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2050.codfw.wmnet with reason: host reimage
* 11:16 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2050.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2050.codfw.wmnet with OS trixie
* 11:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2050: Upgrading es2050.codfw.wmnet
* 10:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2050: Upgrading es2050.codfw.wmnet
* 10:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2057: repool after upgrade
* 10:58 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:55 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 10:46 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1179: Migration of db1179.eqiad.wmnet completed
* 10:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1179.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1179.eqiad.wmnet with reason: host reimage
* 10:16 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
* 10:15 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
* 10:15 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
* 10:15 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/kartotherian: apply
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1179.eqiad.wmnet with reason: host reimage
* 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2057: repool after upgrade
* 10:13 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2057.codfw.wmnet with OS trixie
* 09:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1179.eqiad.wmnet with OS trixie
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1179: Upgrading db1179.eqiad.wmnet
* 09:58 jynus: redoing m2 backups after grant change [[phab:T411111|T411111]]
* 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1179: Upgrading db1179.eqiad.wmnet
* 09:56 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2057.codfw.wmnet with reason: host reimage
* 09:53 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2057.codfw.wmnet with reason: host reimage
* 09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Migration of db1224.eqiad.wmnet completed
* 09:38 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:36 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:35 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:33 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2057.codfw.wmnet with OS trixie
* 09:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2057: Upgrading es2057.codfw.wmnet
* 09:32 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2057: Upgrading es2057.codfw.wmnet
* 09:31 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:26 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=30 --sleep=60 --verbose`
* 09:25 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "group0.dblist + group1.dblist - mediamoderation-continuous-scan.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose`
* 08:54 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Introduce pluggable authentication - oblivian@cumin1003"
* 08:54 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Introduce pluggable authentication - oblivian@cumin1003
* 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Migration of db1224.eqiad.wmnet completed
* 08:53 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Introduce pluggable authentication - oblivian@cumin1003
* 08:53 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Introduce pluggable authentication - oblivian@cumin1003"
* 08:29 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:29 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 08:24 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:24 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 08:21 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1224.eqiad.wmnet with OS trixie
* 08:21 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 08:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
* 08:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2249.codfw.wmnet with reason: upgrade
* 08:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
* 07:53 marostegui: Install mariadb 10.11.17 on db2249 [[phab:T427345|T427345]]
* 07:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1224.eqiad.wmnet with OS trixie
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1224: Upgrading db1224.eqiad.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1224: Upgrading db1224.eqiad.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1255: Migration of db1255.eqiad.wmnet completed
* 07:34 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297536{{!}}hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200{{!}}hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173{{!}}hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]] (duration: 08m 56s)
* 07:29 kharlan@deploy1003: kharlan, harroyo-wmf: Continuing with deployment
* 07:27 kharlan@deploy1003: kharlan, harroyo-wmf: Backport for [[gerrit:1297536{{!}}hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200{{!}}hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173{{!}}hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwd
* 07:25 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297536{{!}}hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200{{!}}hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173{{!}}hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]]
* 07:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2191: Migration of db2191.codfw.wmnet completed
* 07:12 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297550{{!}}Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]] (duration: 06m 45s)
* 07:08 kharlan@deploy1003: kharlan: Continuing with deployment
* 07:08 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297550{{!}}Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:06 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297550{{!}}Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]]
* 07:04 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297260{{!}}EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)]] (duration: 399m 30s)
* 07:03 otto@deploy1003: otto: Rolling back deployment
* 06:53 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1255: Migration of db1255.eqiad.wmnet completed
* 06:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1255.eqiad.wmnet with OS trixie
* 06:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2191: Migration of db2191.codfw.wmnet completed
* 06:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1255.eqiad.wmnet with reason: host reimage
* 06:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2191.codfw.wmnet with OS trixie
* 06:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1255.eqiad.wmnet with reason: host reimage
* 06:16 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1255.eqiad.wmnet with OS trixie
* 06:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2191.codfw.wmnet with reason: host reimage
* 06:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1255: Upgrading db1255.eqiad.wmnet
* 06:12 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1255: Upgrading db1255.eqiad.wmnet
* 06:12 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2191.codfw.wmnet with reason: host reimage
* 06:04 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db1255 [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93836 and previous config saved to /var/cache/conftool/dbconfig/20260604-060428-cwilliams.json
* 06:03 cwilliams@dns1004: END - running authdns-update
* 06:02 cwilliams@dns1004: START - running authdns-update
* 05:54 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db1258 to x3 primary and set section read-write [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93835 and previous config saved to /var/cache/conftool/dbconfig/20260604-055429-cwilliams.json
* 05:53 cwilliams@cumin1003: dbctl commit (dc=all): 'Set x3 eqiad as read-only for maintenance - [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93834 and previous config saved to /var/cache/conftool/dbconfig/20260604-055346-cwilliams.json
* 05:53 cezmunsta: Starting x3 eqiad failover from db1255 to db1258 - [[phab:T427895|T427895]]
* 05:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2191.codfw.wmnet with OS trixie
* 05:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2191: Upgrading db2191.codfw.wmnet
* 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2191: Upgrading db2191.codfw.wmnet
* 05:50 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db1258 with weight 0 [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93833 and previous config saved to /var/cache/conftool/dbconfig/20260604-055021-cwilliams.json
* 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:50 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T427895|T427895]]
* 05:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 05:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2191 [[phab:T428120|T428120]]', diff saved to https://phabricator.wikimedia.org/P93832 and previous config saved to /var/cache/conftool/dbconfig/20260604-054614-marostegui.json
* 05:45 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2215 to x1 primary [[phab:T428120|T428120]]', diff saved to https://phabricator.wikimedia.org/P93831 and previous config saved to /var/cache/conftool/dbconfig/20260604-054528-marostegui.json
* 05:44 marostegui: Starting x1 codfw failover from db2191 to db2215 - [[phab:T428120|T428120]]
* 05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 16 hosts with reason: Primary switchover x1 [[phab:T428120|T428120]]
* 05:27 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2215 with weight 0 [[phab:T428120|T428120]]', diff saved to https://phabricator.wikimedia.org/P93830 and previous config saved to /var/cache/conftool/dbconfig/20260604-052722-marostegui.json
* 05:19 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 03:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93829 and previous config saved to /var/cache/conftool/dbconfig/20260604-034546-fceratto.json
* 03:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P93828 and previous config saved to /var/cache/conftool/dbconfig/20260604-033538-fceratto.json
* 03:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P93827 and previous config saved to /var/cache/conftool/dbconfig/20260604-032531-fceratto.json
* 03:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93826 and previous config saved to /var/cache/conftool/dbconfig/20260604-031523-fceratto.json
* 03:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1263 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93825 and previous config saved to /var/cache/conftool/dbconfig/20260604-030710-fceratto.json
* 03:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1263.eqiad.wmnet with reason: Maintenance
* 03:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93824 and previous config saved to /var/cache/conftool/dbconfig/20260604-030642-fceratto.json
* 02:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P93823 and previous config saved to /var/cache/conftool/dbconfig/20260604-025634-fceratto.json
* 02:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P93822 and previous config saved to /var/cache/conftool/dbconfig/20260604-024627-fceratto.json
* 02:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93821 and previous config saved to /var/cache/conftool/dbconfig/20260604-023619-fceratto.json
* 02:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1262 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93820 and previous config saved to /var/cache/conftool/dbconfig/20260604-022809-fceratto.json
* 02:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1262.eqiad.wmnet with reason: Maintenance
* 02:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93819 and previous config saved to /var/cache/conftool/dbconfig/20260604-022742-fceratto.json
* 02:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P93818 and previous config saved to /var/cache/conftool/dbconfig/20260604-021734-fceratto.json
* 02:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P93817 and previous config saved to /var/cache/conftool/dbconfig/20260604-020726-fceratto.json
* 01:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93816 and previous config saved to /var/cache/conftool/dbconfig/20260604-015718-fceratto.json
* 01:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1261 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93815 and previous config saved to /var/cache/conftool/dbconfig/20260604-014909-fceratto.json
* 01:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1261.eqiad.wmnet with reason: Maintenance
* 01:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93814 and previous config saved to /var/cache/conftool/dbconfig/20260604-014841-fceratto.json
* 01:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P93813 and previous config saved to /var/cache/conftool/dbconfig/20260604-013833-fceratto.json
* 01:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P93812 and previous config saved to /var/cache/conftool/dbconfig/20260604-012826-fceratto.json
* 01:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93811 and previous config saved to /var/cache/conftool/dbconfig/20260604-011818-fceratto.json
* 01:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1260 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93810 and previous config saved to /var/cache/conftool/dbconfig/20260604-011005-fceratto.json
* 01:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1260.eqiad.wmnet with reason: Maintenance
* 01:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93809 and previous config saved to /var/cache/conftool/dbconfig/20260604-010937-fceratto.json
* 00:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P93808 and previous config saved to /var/cache/conftool/dbconfig/20260604-005929-fceratto.json
* 00:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P93807 and previous config saved to /var/cache/conftool/dbconfig/20260604-004922-fceratto.json
* 00:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93806 and previous config saved to /var/cache/conftool/dbconfig/20260604-003914-fceratto.json
* 00:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1252 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93805 and previous config saved to /var/cache/conftool/dbconfig/20260604-002851-fceratto.json
* 00:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1252.eqiad.wmnet with reason: Maintenance
* 00:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93804 and previous config saved to /var/cache/conftool/dbconfig/20260604-002821-fceratto.json
* 00:26 otto@deploy1003: otto: Backport for [[gerrit:1297260{{!}}EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:24 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1297260{{!}}EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)]]
* 00:18 Amir1: mwscript-k8s --follow --dblist=all -- extensions/timeline/maintenance/DeleteOldTimelineFiles.php --date 20210101000000
* 00:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P93803 and previous config saved to /var/cache/conftool/dbconfig/20260604-001813-fceratto.json
* 00:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P93802 and previous config saved to /var/cache/conftool/dbconfig/20260604-000805-fceratto.json
== 2026-06-03 ==
* 23:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93801 and previous config saved to /var/cache/conftool/dbconfig/20260603-235758-fceratto.json
* 23:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93800 and previous config saved to /var/cache/conftool/dbconfig/20260603-234935-fceratto.json
* 23:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
* 23:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93799 and previous config saved to /var/cache/conftool/dbconfig/20260603-234907-fceratto.json
* 23:42 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296561{{!}}Add a maintenance script to delete old files]], [[gerrit:1296560{{!}}Add a maintenance script to delete old files]] (duration: 07m 09s)
* 23:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P93798 and previous config saved to /var/cache/conftool/dbconfig/20260603-233859-fceratto.json
* 23:37 ladsgroup@deploy1003: ladsgroup, reedy: Continuing with deployment
* 23:36 ladsgroup@deploy1003: ladsgroup, reedy: Backport for [[gerrit:1296561{{!}}Add a maintenance script to delete old files]], [[gerrit:1296560{{!}}Add a maintenance script to delete old files]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:34 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1296561{{!}}Add a maintenance script to delete old files]], [[gerrit:1296560{{!}}Add a maintenance script to delete old files]]
* 23:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P93797 and previous config saved to /var/cache/conftool/dbconfig/20260603-232852-fceratto.json
* 23:22 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 23:22 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 23:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93796 and previous config saved to /var/cache/conftool/dbconfig/20260603-231844-fceratto.json
* 23:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93795 and previous config saved to /var/cache/conftool/dbconfig/20260603-231031-fceratto.json
* 23:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
* 23:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93794 and previous config saved to /var/cache/conftool/dbconfig/20260603-231001-fceratto.json
* 22:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P93793 and previous config saved to /var/cache/conftool/dbconfig/20260603-225953-fceratto.json
* 22:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P93792 and previous config saved to /var/cache/conftool/dbconfig/20260603-224945-fceratto.json
* 22:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93791 and previous config saved to /var/cache/conftool/dbconfig/20260603-223937-fceratto.json
* 22:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1244 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93790 and previous config saved to /var/cache/conftool/dbconfig/20260603-223116-fceratto.json
* 22:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1244.eqiad.wmnet with reason: Maintenance
* 22:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93789 and previous config saved to /var/cache/conftool/dbconfig/20260603-223048-fceratto.json
* 22:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P93788 and previous config saved to /var/cache/conftool/dbconfig/20260603-222041-fceratto.json
* 22:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P93787 and previous config saved to /var/cache/conftool/dbconfig/20260603-221034-fceratto.json
* 22:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93786 and previous config saved to /var/cache/conftool/dbconfig/20260603-220026-fceratto.json
* 21:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1243 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93785 and previous config saved to /var/cache/conftool/dbconfig/20260603-215110-fceratto.json
* 21:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93784 and previous config saved to /var/cache/conftool/dbconfig/20260603-215053-fceratto.json
* 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P93783 and previous config saved to /var/cache/conftool/dbconfig/20260603-214046-fceratto.json
* 21:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P93782 and previous config saved to /var/cache/conftool/dbconfig/20260603-213038-fceratto.json
* 21:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93781 and previous config saved to /var/cache/conftool/dbconfig/20260603-212030-fceratto.json
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1242 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93779 and previous config saved to /var/cache/conftool/dbconfig/20260603-211206-fceratto.json
* 21:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
* 21:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93778 and previous config saved to /var/cache/conftool/dbconfig/20260603-211138-fceratto.json
* 21:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P93774 and previous config saved to /var/cache/conftool/dbconfig/20260603-210130-fceratto.json
* 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P93773 and previous config saved to /var/cache/conftool/dbconfig/20260603-205122-fceratto.json
* 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93772 and previous config saved to /var/cache/conftool/dbconfig/20260603-204115-fceratto.json
* 20:33 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297228{{!}}Attribution research don't use testKitchen compatibility layer (T417050)]] (duration: 06m 41s)
* 20:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1241 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93771 and previous config saved to /var/cache/conftool/dbconfig/20260603-203254-fceratto.json
* 20:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
* 20:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93770 and previous config saved to /var/cache/conftool/dbconfig/20260603-203227-fceratto.json
* 20:29 cjming@deploy1003: cjming: Continuing with deployment
* 20:29 cjming@deploy1003: cjming: Backport for [[gerrit:1297228{{!}}Attribution research don't use testKitchen compatibility layer (T417050)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:26 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1297228{{!}}Attribution research don't use testKitchen compatibility layer (T417050)]]
* 20:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P93769 and previous config saved to /var/cache/conftool/dbconfig/20260603-202219-fceratto.json
* 20:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P93766 and previous config saved to /var/cache/conftool/dbconfig/20260603-201211-fceratto.json
* 20:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93765 and previous config saved to /var/cache/conftool/dbconfig/20260603-200203-fceratto.json
* 19:59 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
* 19:59 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
* 19:59 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 19:59 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 19:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93764 and previous config saved to /var/cache/conftool/dbconfig/20260603-195341-fceratto.json
* 19:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
* 19:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93763 and previous config saved to /var/cache/conftool/dbconfig/20260603-195313-fceratto.json
* 19:47 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 19:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P93762 and previous config saved to /var/cache/conftool/dbconfig/20260603-194306-fceratto.json
* 19:39 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 19:37 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 19:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P93761 and previous config saved to /var/cache/conftool/dbconfig/20260603-193258-fceratto.json
* 19:26 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
* 19:25 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
* 19:25 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 19:25 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 19:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93760 and previous config saved to /var/cache/conftool/dbconfig/20260603-192250-fceratto.json
* 19:22 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 19:22 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 19:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93759 and previous config saved to /var/cache/conftool/dbconfig/20260603-191437-fceratto.json
* 19:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1024-1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 19:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
* 19:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93758 and previous config saved to /var/cache/conftool/dbconfig/20260603-191348-fceratto.json
* 19:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P93757 and previous config saved to /var/cache/conftool/dbconfig/20260603-190340-fceratto.json
* 18:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P93756 and previous config saved to /var/cache/conftool/dbconfig/20260603-185331-fceratto.json
* 18:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93755 and previous config saved to /var/cache/conftool/dbconfig/20260603-184324-fceratto.json
* 18:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1199 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93754 and previous config saved to /var/cache/conftool/dbconfig/20260603-183455-fceratto.json
* 18:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
* 18:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93753 and previous config saved to /var/cache/conftool/dbconfig/20260603-183427-fceratto.json
* 18:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P93752 and previous config saved to /var/cache/conftool/dbconfig/20260603-182420-fceratto.json
* 18:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P93751 and previous config saved to /var/cache/conftool/dbconfig/20260603-181412-fceratto.json
* 18:10 dancy@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.5 refs [[phab:T423914|T423914]]
* 18:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93750 and previous config saved to /var/cache/conftool/dbconfig/20260603-180404-fceratto.json
* 17:57 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 17:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93749 and previous config saved to /var/cache/conftool/dbconfig/20260603-175544-fceratto.json
* 17:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
* 17:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93748 and previous config saved to /var/cache/conftool/dbconfig/20260603-175342-fceratto.json
* 17:52 hashar: contint1003: sudo puppet agent --disable "Prevent Jenkins from coming back"
* 17:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P93747 and previous config saved to /var/cache/conftool/dbconfig/20260603-174334-fceratto.json
* 17:38 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:37 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:37 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:33 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P93746 and previous config saved to /var/cache/conftool/dbconfig/20260603-173327-fceratto.json
* 17:33 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:32 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 17:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93745 and previous config saved to /var/cache/conftool/dbconfig/20260603-172319-fceratto.json
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: Stopping before sync operations
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: Started scap sync-world: No-deploy scap run to verify scap config change
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1253 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93744 and previous config saved to /var/cache/conftool/dbconfig/20260603-171521-fceratto.json
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1253.eqiad.wmnet with reason: Maintenance
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93743 and previous config saved to /var/cache/conftool/dbconfig/20260603-171452-fceratto.json
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:12 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:10 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:10 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:10 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:09 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2012.wikimedia.org with OS trixie
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P93742 and previous config saved to /var/cache/conftool/dbconfig/20260603-170444-fceratto.json
* 17:04 swfrench@deploy1003: Stopping before sync operations
* 17:03 swfrench@deploy1003: Started scap sync-world: No-deploy scap run to verify clean state before config change
* 16:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P93741 and previous config saved to /var/cache/conftool/dbconfig/20260603-165436-fceratto.json
* 16:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:53 hashar: Restarting CI Jenkins one last time # [[phab:T418521|T418521]]
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:44 btullis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295922{{!}}Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)]] (duration: 07m 16s)
* 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93740 and previous config saved to /var/cache/conftool/dbconfig/20260603-164428-fceratto.json
* 16:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:42 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:41 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:40 btullis@deploy1003: btullis: Continuing with deployment
* 16:39 btullis@deploy1003: btullis: Backport for [[gerrit:1295922{{!}}Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93739 and previous config saved to /var/cache/conftool/dbconfig/20260603-163726-fceratto.json
* 16:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance
* 16:37 btullis@deploy1003: Started scap sync-world: Backport for [[gerrit:1295922{{!}}Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)]]
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93738 and previous config saved to /var/cache/conftool/dbconfig/20260603-163658-fceratto.json
* 16:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P93737 and previous config saved to /var/cache/conftool/dbconfig/20260603-162650-fceratto.json
* 16:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P93736 and previous config saved to /var/cache/conftool/dbconfig/20260603-161643-fceratto.json
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93735 and previous config saved to /var/cache/conftool/dbconfig/20260603-160635-fceratto.json
* 16:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93734 and previous config saved to /var/cache/conftool/dbconfig/20260603-155928-fceratto.json
* 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93733 and previous config saved to /var/cache/conftool/dbconfig/20260603-155859-fceratto.json
* 15:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P93732 and previous config saved to /var/cache/conftool/dbconfig/20260603-154852-fceratto.json
* 15:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:46 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2012.wikimedia.org with OS trixie
* 15:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:40 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
* 15:40 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
* 15:40 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 15:39 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 15:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P93731 and previous config saved to /var/cache/conftool/dbconfig/20260603-153844-fceratto.json
* 15:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93729 and previous config saved to /var/cache/conftool/dbconfig/20260603-152836-fceratto.json
* 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
* 15:25 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
* 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
* 15:25 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
* 15:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:23 mutante: disabling jenkins on CI servers for maintenance
* 15:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
* 15:23 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1202 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93728 and previous config saved to /var/cache/conftool/dbconfig/20260603-152129-fceratto.json
* 15:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 15:21 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2012 to codfw - jhancock@cumin2002"
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93727 and previous config saved to /var/cache/conftool/dbconfig/20260603-152102-fceratto.json
* 15:20 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2012 to codfw - jhancock@cumin2002"
* 15:18 brouberol@dns1004: END - running authdns-update
* 15:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:16 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:16 brouberol@dns1004: START - running authdns-update
* 15:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P93726 and previous config saved to /var/cache/conftool/dbconfig/20260603-151055-fceratto.json
* 15:01 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P93725 and previous config saved to /var/cache/conftool/dbconfig/20260603-150047-fceratto.json
* 14:57 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 cmooney@cumin1003: END (FAIL) - Cookbook sre.netbox.update-extras (exit_code=1) rolling restart_daemons on A:netbox
* 14:51 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93723 and previous config saved to /var/cache/conftool/dbconfig/20260603-145039-fceratto.json
* 14:48 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297137{{!}}Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"]] (duration: 06m 46s)
* 14:47 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 14:46 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:46 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:43 mlitn@deploy1003: mlitn: Continuing with deployment
* 14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93722 and previous config saved to /var/cache/conftool/dbconfig/20260603-144334-fceratto.json
* 14:43 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 14:43 mlitn@deploy1003: mlitn: Backport for [[gerrit:1297137{{!}}Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93721 and previous config saved to /var/cache/conftool/dbconfig/20260603-144306-fceratto.json
* 14:41 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:41 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:41 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1297137{{!}}Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"]]
* 14:39 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:39 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:39 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:39 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:38 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 14:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 14:34 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297130{{!}}editor: make redesigned anon warning the default experience (T424595)]] (duration: 10m 45s)
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P93719 and previous config saved to /var/cache/conftool/dbconfig/20260603-143259-fceratto.json
* 14:30 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:28 sgimeno@deploy1003: sgimeno: Continuing with deployment
* 14:25 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1297130{{!}}editor: make redesigned anon warning the default experience (T424595)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:24 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:24 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:23 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1297130{{!}}editor: make redesigned anon warning the default experience (T424595)]]
* 14:23 gengh@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P93717 and previous config saved to /var/cache/conftool/dbconfig/20260603-142251-fceratto.json
* 14:22 gengh@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:22 gengh@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:21 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:21 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:21 gengh@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:20 gengh@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:20 gengh@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:20 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:20 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:19 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:19 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:16 vriley@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:16 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:16 gengh@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:13 gengh@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:12 gengh@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93716 and previous config saved to /var/cache/conftool/dbconfig/20260603-141242-fceratto.json
* 14:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:11 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:11 gengh@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc2055.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:10 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mc2055.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:10 gengh@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:09 gengh@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:08 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:07 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:05 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296631{{!}}translate: adding separate read/write endpoints (T425377)]] (duration: 13m 06s)
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1191 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93715 and previous config saved to /var/cache/conftool/dbconfig/20260603-140537-fceratto.json
* 14:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93714 and previous config saved to /var/cache/conftool/dbconfig/20260603-140507-fceratto.json
* 14:01 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:58 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 13:56 dcausse@deploy1003: atsuko, dcausse: Rolling back deployment
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T426633|T426633]])', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-133440-fceratto.json
* 13:29 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2186: Migration of db2186.codfw.wmnet completed
* 13:28 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295910{{!}}hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)]] (duration: 07m 36s)
* 13:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T426633|T426633]])', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-132638-fceratto.json
* 13:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 13:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93710 and previous config saved to /var/cache/conftool/dbconfig/20260603-132605-fceratto.json
* 13:25 sukhe: sudo cumin 'A:lvs or A:liberica' 'disable-puppet "merging CR 1282764"'
* 13:23 kharlan@deploy1003: kharlan: Continuing with deployment
* 13:22 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295910{{!}}hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295910{{!}}hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)]]
* 13:18 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296649{{!}}hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)]] (duration: 07m 46s)
* 13:16 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-131556-fceratto.json
* 13:15 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:13 kharlan@deploy1003: dbrant, kharlan: Continuing with deployment
* 13:12 kharlan@deploy1003: dbrant, kharlan: Backport for [[gerrit:1296649{{!}}hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296649{{!}}hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)]]
* 13:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add codfw d3 and e5 public vlans - ayounsi@cumin1003"
* 13:09 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add codfw d3 and e5 public vlans - ayounsi@cumin1003"
* 13:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P93708 and previous config saved to /var/cache/conftool/dbconfig/20260603-130548-fceratto.json
* 13:05 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93706 and previous config saved to /var/cache/conftool/dbconfig/20260603-125540-fceratto.json
* 12:51 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297110{{!}}ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)]] (duration: 07m 44s)
* 12:49 jgreen@dns1004: END - running authdns-update
* 12:47 jgreen@dns1004: START - running authdns-update
* 12:46 jiji@deploy1003: jiji: Continuing with deployment
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93705 and previous config saved to /var/cache/conftool/dbconfig/20260603-124624-fceratto.json
* 12:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93704 and previous config saved to /var/cache/conftool/dbconfig/20260603-124556-fceratto.json
* 12:45 jiji@deploy1003: jiji: Backport for [[gerrit:1297110{{!}}ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:43 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2186: Migration of db2186.codfw.wmnet completed
* 12:43 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1297110{{!}}ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)]]
* 12:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1067.eqiad.wmnet with OS bullseye
* 12:38 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1292364{{!}}Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)]] (duration: 11m 15s)
* 12:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2186.codfw.wmnet with OS trixie
* 12:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P93702 and previous config saved to /var/cache/conftool/dbconfig/20260603-123548-fceratto.json
* 12:34 dreamyjazz@deploy1003: somerandomdeveloper, dreamyjazz: Continuing with deployment
* 12:31 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1066.eqiad.wmnet with OS bullseye
* 12:29 dreamyjazz@deploy1003: somerandomdeveloper, dreamyjazz: Backport for [[gerrit:1292364{{!}}Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:27 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1292364{{!}}Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)]]
* 12:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P93701 and previous config saved to /var/cache/conftool/dbconfig/20260603-122541-fceratto.json
* 12:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1067.eqiad.wmnet with reason: host reimage
* 12:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2186.codfw.wmnet with reason: host reimage
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93700 and previous config saved to /var/cache/conftool/dbconfig/20260603-121533-fceratto.json
* 12:13 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ms-be1066.eqiad.wmnet with reason: host reimage
* 12:13 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2186.codfw.wmnet with reason: host reimage
* 12:11 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1067.eqiad.wmnet with reason: host reimage
* 12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93699 and previous config saved to /var/cache/conftool/dbconfig/20260603-120732-fceratto.json
* 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 12:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93698 and previous config saved to /var/cache/conftool/dbconfig/20260603-120634-fceratto.json
* 12:03 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1066.eqiad.wmnet with reason: host reimage
* 11:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P93697 and previous config saved to /var/cache/conftool/dbconfig/20260603-115626-fceratto.json
* 11:54 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2186.codfw.wmnet with OS trixie
* 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1067
* 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1067
* 11:52 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1067
* 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1067.eqiad.wmnet 96.48.64.10.in-addr.arpa 6.9.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:52 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1067.eqiad.wmnet 96.48.64.10.in-addr.arpa 6.9.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1067 - mvernon@cumin2002"
* 11:52 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1067 - mvernon@cumin2002"
* 11:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2186: Upgrading db2186.codfw.wmnet
* 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2186: Upgrading db2186.codfw.wmnet
* 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:47 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:46 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1067
* 11:46 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1067.eqiad.wmnet with OS bullseye
* 11:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P93695 and previous config saved to /var/cache/conftool/dbconfig/20260603-114618-fceratto.json
* 11:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1066
* 11:46 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1066
* 11:45 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1066
* 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1066.eqiad.wmnet 117.32.64.10.in-addr.arpa 7.1.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:45 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1066.eqiad.wmnet 117.32.64.10.in-addr.arpa 7.1.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1066 - mvernon@cumin2002"
* 11:45 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1066 - mvernon@cumin2002"
* 11:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 11:41 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:40 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1066
* 11:40 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1066.eqiad.wmnet with OS bullseye
* 11:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1067
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93693 and previous config saved to /var/cache/conftool/dbconfig/20260603-113611-fceratto.json
* 11:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Migration of db2196.codfw.wmnet completed
* 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93691 and previous config saved to /var/cache/conftool/dbconfig/20260603-112909-fceratto.json
* 11:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 6 hosts with reason: Maintenance
* 11:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
* 11:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93690 and previous config saved to /var/cache/conftool/dbconfig/20260603-112838-fceratto.json
* 11:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P93689 and previous config saved to /var/cache/conftool/dbconfig/20260603-111831-fceratto.json
* 11:14 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:09 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 11:09 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 11:08 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P93687 and previous config saved to /var/cache/conftool/dbconfig/20260603-110823-fceratto.json
* 11:07 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1066
* 11:07 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 11:06 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
* 11:05 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
* 11:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289895{{!}}Update UserInfoCard to be enabled by default for certain user groups (T426021)]] (duration: 07m 37s)
* 11:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:59 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 10:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:59 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 10:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 10:58 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93685 and previous config saved to /var/cache/conftool/dbconfig/20260603-105815-fceratto.json
* 10:58 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:56 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 10:55 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1289895{{!}}Update UserInfoCard to be enabled by default for certain user groups (T426021)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:54 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 10:54 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
* 10:53 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
* 10:53 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1289895{{!}}Update UserInfoCard to be enabled by default for certain user groups (T426021)]]
* 10:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1198 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93684 and previous config saved to /var/cache/conftool/dbconfig/20260603-105006-fceratto.json
* 10:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 10:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93683 and previous config saved to /var/cache/conftool/dbconfig/20260603-104939-fceratto.json
* 10:45 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:45 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Migration of db2196.codfw.wmnet completed
* 10:44 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:41 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:40 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P93681 and previous config saved to /var/cache/conftool/dbconfig/20260603-103931-fceratto.json
* 10:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1053: repool after upgrade
* 10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2196.codfw.wmnet with OS trixie
* 10:36 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297090{{!}}hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)]] (duration: 12m 03s)
* 10:32 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 10:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P93679 and previous config saved to /var/cache/conftool/dbconfig/20260603-102924-fceratto.json
* 10:26 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297090{{!}}hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:24 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297090{{!}}hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)]]
* 10:22 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1067
* 10:21 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1066
* 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93677 and previous config saved to /var/cache/conftool/dbconfig/20260603-101916-fceratto.json
* 10:15 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2013.codfw.wmnet
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
* 10:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93676 and previous config saved to /var/cache/conftool/dbconfig/20260603-101105-fceratto.json
* 10:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 10:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93675 and previous config saved to /var/cache/conftool/dbconfig/20260603-101037-fceratto.json
* 10:10 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2013.codfw.wmnet
* 10:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P93673 and previous config saved to /var/cache/conftool/dbconfig/20260603-100029-fceratto.json
* 09:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2196.codfw.wmnet with OS trixie
* 09:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: Upgrading db2196.codfw.wmnet
* 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2196: Upgrading db2196.codfw.wmnet
* 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1053: repool after upgrade
* 09:52 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:52 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:51 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:51 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:51 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P93670 and previous config saved to /var/cache/conftool/dbconfig/20260603-095022-fceratto.json
* 09:49 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:49 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:48 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1053.eqiad.wmnet with OS trixie
* 09:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2013.codfw.wmnet
* 09:41 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on es1053.eqiad.wmnet with reason: host reimage
* 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1053.eqiad.wmnet with reason: host reimage
* 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93669 and previous config saved to /var/cache/conftool/dbconfig/20260603-094014-fceratto.json
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2215: Migration of db2215.codfw.wmnet completed
* 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2013.codfw.wmnet
* 09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93667 and previous config saved to /var/cache/conftool/dbconfig/20260603-093146-fceratto.json
* 09:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93666 and previous config saved to /var/cache/conftool/dbconfig/20260603-093119-fceratto.json
* 09:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1211: Migration of db1211.eqiad.wmnet completed
* 09:27 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297069{{!}}hCaptcha: Collect risk score for blocked account creations (T427784)]] (duration: 07m 26s)
* 09:25 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1053.eqiad.wmnet with OS trixie
* 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add public1-b3-codfw gateway IPs - ayounsi@cumin1003"
* 09:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add public1-b3-codfw gateway IPs - ayounsi@cumin1003"
* 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1053: Upgrading es1053.eqiad.wmnet
* 09:23 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1053: Upgrading es1053.eqiad.wmnet
* 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:21 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297069{{!}}hCaptcha: Collect risk score for blocked account creations (T427784)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:21 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 09:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2054: repool after upgrade
* 09:21 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
* 09:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P93661 and previous config saved to /var/cache/conftool/dbconfig/20260603-092111-fceratto.json
* 09:20 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:20 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297069{{!}}hCaptcha: Collect risk score for blocked account creations (T427784)]]
* 09:14 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297065{{!}}Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"]] (duration: 07m 06s)
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P93659 and previous config saved to /var/cache/conftool/dbconfig/20260603-091104-fceratto.json
* 09:10 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:09 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297065{{!}}Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:07 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297065{{!}}Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"]]
* 09:06 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 09:06 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297064{{!}}Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] (duration: 10m 54s)
* 09:05 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 09:04 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 09:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003 - [[phab:T422043|T422043]]"
* 09:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93656 and previous config saved to /var/cache/conftool/dbconfig/20260603-090056-fceratto.json
* 09:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003 - [[phab:T422043|T422043]]"
* 09:00 ayounsi@cumin1003: END (ERROR) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=97) generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003"
* 09:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003"
* 08:59 kharlan@deploy1003: kharlan: Continuing with deployment
* 08:59 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297064{{!}}Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:55 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297064{{!}}Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]]
* 08:53 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296635{{!}}Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] (duration: 11m 43s)
* 08:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2215: Migration of db2215.codfw.wmnet completed
* 08:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet
* 08:52 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet
* 08:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb[1022-1023].eqiad.wmnet
* 08:51 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb[1022-1023].eqiad.wmnet
* 08:50 kharlan@deploy1003: kharlan: Rolling back deployment
* 08:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93652 and previous config saved to /var/cache/conftool/dbconfig/20260603-084846-fceratto.json
* 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 08:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93651 and previous config saved to /var/cache/conftool/dbconfig/20260603-084819-fceratto.json
* 08:47 kharlan@deploy1003: kharlan: Backport for [[gerrit:1296635{{!}}Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2215.codfw.wmnet with OS trixie
* 08:45 jiji@cumin1003: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) check docker-registry: maintenance
* 08:45 jiji@cumin1003: START - Cookbook sre.discovery.service-route check docker-registry: maintenance
* 08:43 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1211: Migration of db1211.eqiad.wmnet completed
* 08:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296635{{!}}Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]]
* 08:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1211.eqiad.wmnet with OS trixie
* 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93649 and previous config saved to /var/cache/conftool/dbconfig/20260603-083811-fceratto.json
* 08:37 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296632{{!}}Image Browsing: add accessible labels to carousel elements (T407793)]] (duration: 32m 11s)
* 08:36 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2054: repool after upgrade
* 08:35 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool es2054.codfw.wmnet: After reimage
* 08:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2054.codfw.wmnet: After reimage
* 08:35 jiji@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:34 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 08:34 jiji@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:33 jiji@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:33 jiji@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:31 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:31 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2054.codfw.wmnet with OS trixie
* 08:30 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:29 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2215.codfw.wmnet with reason: host reimage
* 08:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93647 and previous config saved to /var/cache/conftool/dbconfig/20260603-082804-fceratto.json
* 08:25 mszwarc@deploy1003: mlitn, mszwarc: Continuing with deployment
* 08:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1049: repool after upgrade
* 08:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2215.codfw.wmnet with reason: host reimage
* 08:22 mszwarc@deploy1003: mlitn, mszwarc: Backport for [[gerrit:1296632{{!}}Image Browsing: add accessible labels to carousel elements (T407793)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
* 08:18 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 08:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93645 and previous config saved to /var/cache/conftool/dbconfig/20260603-081756-fceratto.json
* 08:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 08:17 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 08:16 jiji@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2054.codfw.wmnet with reason: host reimage
* 08:08 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2054.codfw.wmnet with reason: host reimage
* 08:05 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1296632{{!}}Image Browsing: add accessible labels to carousel elements (T407793)]]
* {{safesubst:SAL entry|1=08:04 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296580{{!}}Add kha to wmgExtraLanguageNames (T427917)]], [[gerrit:1296703{{!}}jawiki: lift IP caps for workshop (T427912)]], [[gerrit:1296713{{!}}conductwiki: add sitename and logo (T426984 T427541)]], [[gerrit:1296627{{!}}Add missing lazy img to carousel (T427821)]], [[gerrit:1295968{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T426799)]]}}
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93643 and previous config saved to /var/cache/conftool/dbconfig/20260603-080346-fceratto.json
* 08:03 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1211.eqiad.wmnet with OS trixie
* 08:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 08:03 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2215.codfw.wmnet with OS trixie
* 08:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1211: Upgrading db1211.eqiad.wmnet
* 08:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2215: Upgrading db2215.codfw.wmnet
* 08:01 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:01 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1211: Upgrading db1211.eqiad.wmnet
* 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2215: Upgrading db2215.codfw.wmnet
* 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:01 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1157: Repooling
* 08:01 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1157: Repooling
* 08:00 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:57 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1022-1023].eqiad.wmnet with reason: Reimaging upstream server
* 07:57 mszwarc@deploy1003: anzx, mlitn, mfossati, mszwarc: Continuing with deployment
* 07:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Reimaging upstream server
* {{safesubst:SAL entry|1=07:54 mszwarc@deploy1003: anzx, mlitn, mfossati, mszwarc: Backport for [[gerrit:1296580{{!}}Add kha to wmgExtraLanguageNames (T427917)]], [[gerrit:1296703{{!}}jawiki: lift IP caps for workshop (T427912)]], [[gerrit:1296713{{!}}conductwiki: add sitename and logo (T426984 T427541)]], [[gerrit:1296627{{!}}Add missing lazy img to carousel (T427821)]], [[gerrit:1295968{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T42…]]}}
* 07:52 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2231: repool after maintenance
* 07:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2054.codfw.wmnet with OS trixie
* 07:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2054: Upgrading es2054.codfw.wmnet
* 07:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2054: Upgrading es2054.codfw.wmnet
* 07:50 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1296580{{!}}Add kha to wmgExtraLanguageNames (T427917)]], [[gerrit:1296703{{!}}jawiki: lift IP caps for workshop (T427912)]], [[gerrit:1296713{{!}}conductwiki: add sitename and logo (T426984 T427541)]], [[gerrit:1296627{{!}}Add missing lazy img to carousel (T427821)]], [[gerrit:1295968{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T426799)]]
* 07:48 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296516{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]], [[gerrit:1296517{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]] (duration: 32m 13s)
* 07:44 marostegui@dns1004: END - running authdns-update
* 07:43 marostegui@dns1004: START - running authdns-update
* 07:42 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1056 to es2 eqiad primary [[phab:T427875|T427875]]', diff saved to https://phabricator.wikimedia.org/P93637 and previous config saved to /var/cache/conftool/dbconfig/20260603-074250-marostegui.json
* 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1049: repool after upgrade
* 07:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:35 mszwarc@deploy1003: mszwarc, stran: Continuing with deployment
* 07:35 mszwarc@deploy1003: mszwarc, stran: Backport for [[gerrit:1296516{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]], [[gerrit:1296517{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1049.eqiad.wmnet with OS trixie
* 07:16 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1296516{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]], [[gerrit:1296517{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]]
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1049.eqiad.wmnet with reason: host reimage
* 07:07 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1049.eqiad.wmnet with reason: host reimage
* 07:07 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2231: repool after maintenance
* 07:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2231.codfw.wmnet with OS trixie
* 06:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1049.eqiad.wmnet with OS trixie
* 06:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1049: Upgrading es1049.eqiad.wmnet
* 06:46 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2056 to es2 codfw primary [[phab:T427875|T427875]]', diff saved to https://phabricator.wikimedia.org/P93632 and previous config saved to /var/cache/conftool/dbconfig/20260603-064623-marostegui.json
* 06:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1049: Upgrading es1049.eqiad.wmnet
* 06:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1056: repool after upgrade
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2231.codfw.wmnet with reason: host reimage
* 06:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2231.codfw.wmnet with reason: host reimage
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2231.codfw.wmnet with OS trixie
* 06:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2231: Upgrading db2231.codfw.wmnet
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2231: Upgrading db2231.codfw.wmnet
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:59 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1056: repool after upgrade
* 05:59 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 05:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1056.eqiad.wmnet with OS trixie
* 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1056.eqiad.wmnet with reason: host reimage
* 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1056.eqiad.wmnet with reason: host reimage
* 05:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1056.eqiad.wmnet with OS trixie
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1056: Upgrading es1056.eqiad.wmnet
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1056: Upgrading es1056.eqiad.wmnet
* 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
== 2026-06-02 ==
* 22:21 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296689{{!}}hCaptcha: Correct inaccurate comment]] (duration: 06m 27s)
* 22:18 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:18 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:17 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 22:17 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296689{{!}}hCaptcha: Correct inaccurate comment]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:15 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296689{{!}}hCaptcha: Correct inaccurate comment]]
* 22:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296551{{!}}hCaptcha: Enable for badlogin on group0 wikis (T426875)]] (duration: 08m 31s)
* 22:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:09 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 22:07 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296551{{!}}hCaptcha: Enable for badlogin on group0 wikis (T426875)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:05 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296551{{!}}hCaptcha: Enable for badlogin on group0 wikis (T426875)]]
* 20:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93621 and previous config saved to /var/cache/conftool/dbconfig/20260602-203945-fceratto.json
* 20:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93620 and previous config saved to /var/cache/conftool/dbconfig/20260602-202937-fceratto.json
* 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1054.eqiad.wmnet
* 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1054.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:26 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1054.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:20 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 20:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93619 and previous config saved to /var/cache/conftool/dbconfig/20260602-201929-fceratto.json
* 20:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93618 and previous config saved to /var/cache/conftool/dbconfig/20260602-200922-fceratto.json
* 20:03 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1054.eqiad.wmnet
* 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1053.eqiad.wmnet
* 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1053.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:37 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1053.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93617 and previous config saved to /var/cache/conftool/dbconfig/20260602-190907-fceratto.json
* 19:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 19:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93616 and previous config saved to /var/cache/conftool/dbconfig/20260602-190811-fceratto.json
* 19:05 dancy@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.5 refs [[phab:T423914|T423914]]
* 18:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P93615 and previous config saved to /var/cache/conftool/dbconfig/20260602-185804-fceratto.json
* 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P93614 and previous config saved to /var/cache/conftool/dbconfig/20260602-184757-fceratto.json
* 18:38 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:38 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:38 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93612 and previous config saved to /var/cache/conftool/dbconfig/20260602-183749-fceratto.json
* 18:37 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:37 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:33 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1053.eqiad.wmnet
* 18:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93611 and previous config saved to /var/cache/conftool/dbconfig/20260602-183023-fceratto.json
* 18:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93610 and previous config saved to /var/cache/conftool/dbconfig/20260602-182956-fceratto.json
* 18:27 mutante: gerrit delete unused plugin projects: barricade, WikimediaBlocks and WikimediaWebSessions
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1052.eqiad.wmnet
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1052.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1052.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:25 dancy: Train is blocked at testwikis on https://phabricator.wikimedia.org/T427935
* 18:21 Daimona: Running query from [[phab:T427962|T427962]]#11978299 in x1.wikishared
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P93609 and previous config saved to /var/cache/conftool/dbconfig/20260602-181949-fceratto.json
* 18:16 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296615{{!}}feat(cleanMentorList): Add a feature flag (T427386)]], [[gerrit:1296614{{!}}feat(cleanMentorList): Add a feature flag (T427386)]] (duration: 34m 09s)
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:12 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:12 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:12 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P93608 and previous config saved to /var/cache/conftool/dbconfig/20260602-180941-fceratto.json
* 18:08 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 18:07 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 18:06 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 18:06 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 18:04 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 18:02 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 18:02 swfrench-wmf: reverting shellbox to 2026-05-20-192555 due to errors in shellbox-syntaxhighlight
* 18:02 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 18:01 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 18:01 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 18:01 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1296615{{!}}feat(cleanMentorList): Add a feature flag (T427386)]], [[gerrit:1296614{{!}}feat(cleanMentorList): Add a feature flag (T427386)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:00 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1052.eqiad.wmnet
* 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93607 and previous config saved to /var/cache/conftool/dbconfig/20260602-175933-fceratto.json
* 17:58 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:57 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1051.eqiad.wmnet
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1051.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:55 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1051.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:53 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93605 and previous config saved to /var/cache/conftool/dbconfig/20260602-175227-fceratto.json
* 17:52 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93604 and previous config saved to /var/cache/conftool/dbconfig/20260602-175157-fceratto.json
* 17:51 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:51 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:50 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:50 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:50 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:49 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:49 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:48 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:48 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:47 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:44 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:42 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:42 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:42 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P93603 and previous config saved to /var/cache/conftool/dbconfig/20260602-174150-fceratto.json
* 17:41 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1296615{{!}}feat(cleanMentorList): Add a feature flag (T427386)]], [[gerrit:1296614{{!}}feat(cleanMentorList): Add a feature flag (T427386)]]
* 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P93602 and previous config saved to /var/cache/conftool/dbconfig/20260602-173143-fceratto.json
* 17:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93601 and previous config saved to /var/cache/conftool/dbconfig/20260602-172135-fceratto.json
* 17:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93600 and previous config saved to /var/cache/conftool/dbconfig/20260602-171422-fceratto.json
* 17:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93599 and previous config saved to /var/cache/conftool/dbconfig/20260602-171354-fceratto.json
* 17:04 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P93598 and previous config saved to /var/cache/conftool/dbconfig/20260602-170344-fceratto.json
* 16:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P93597 and previous config saved to /var/cache/conftool/dbconfig/20260602-165336-fceratto.json
* 16:49 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1051.eqiad.wmnet
* 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1050.eqiad.wmnet
* 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1050.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:47 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1050.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93596 and previous config saved to /var/cache/conftool/dbconfig/20260602-164328-fceratto.json
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93595 and previous config saved to /var/cache/conftool/dbconfig/20260602-163622-fceratto.json
* 16:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 16:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93594 and previous config saved to /var/cache/conftool/dbconfig/20260602-163550-fceratto.json
* 16:34 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:34 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS trixie
* 16:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:27 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2006.codfw.wmnet with OS trixie
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P93593 and previous config saved to /var/cache/conftool/dbconfig/20260602-162542-fceratto.json
* 16:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P93591 and previous config saved to /var/cache/conftool/dbconfig/20260602-161534-fceratto.json
* 16:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 16:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS trixie
* 16:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296624{{!}}Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] (duration: 06m 40s)
* 16:09 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
* 16:05 kharlan@deploy1003: kharlan: Continuing with deployment
* 16:05 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 16:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93590 and previous config saved to /var/cache/conftool/dbconfig/20260602-160527-fceratto.json
* 16:05 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
* 16:05 kharlan@deploy1003: kharlan: Backport for [[gerrit:1296624{{!}}Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:03 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296624{{!}}Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]]
* 15:59 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295909{{!}}hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)]] (duration: 09m 48s)
* 15:59 kharlan@deploy1003: kharlan: Rolling back deployment
* 15:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93589 and previous config saved to /var/cache/conftool/dbconfig/20260602-155817-fceratto.json
* 15:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 15:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93588 and previous config saved to /var/cache/conftool/dbconfig/20260602-155749-fceratto.json
* 15:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 15:53 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS trixie
* 15:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS trixie
* 15:51 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295909{{!}}hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:50 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 15:49 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295909{{!}}hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)]]
* 15:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P93587 and previous config saved to /var/cache/conftool/dbconfig/20260602-154742-fceratto.json
* 15:47 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296558{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]], [[gerrit:1296568{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]] (duration: 07m 24s)
* 15:43 kharlan@deploy1003: kharlan: Continuing with deployment
* 15:42 kharlan@deploy1003: kharlan: Backport for [[gerrit:1296558{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]], [[gerrit:1296568{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:40 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296558{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]], [[gerrit:1296568{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]]
* 15:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P93586 and previous config saved to /var/cache/conftool/dbconfig/20260602-153734-fceratto.json
* 15:37 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS trixie
* 15:36 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS trixie
* 15:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 15:32 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:32 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:31 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 15:30 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:29 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93585 and previous config saved to /var/cache/conftool/dbconfig/20260602-152726-fceratto.json
* 15:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2158: Repooling
* {{safesubst:SAL entry|1=15:22 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295502{{!}}Revert "labswiki: Disallow account autocreation"]], [[gerrit:1283106{{!}}Remove unused 'writeapi' right]], [[gerrit:1296566{{!}}Clean up bot password configuration]], [[gerrit:1296563{{!}}Remove workaround for stuck session cookies on Wikitech (T389433)]], [[gerrit:1295574{{!}}cswiki: lift IP cap for workshop on 08-June-2026 (T427678)]], [[gerrit:1296582]]}}
* 15:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 15:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93583 and previous config saved to /var/cache/conftool/dbconfig/20260602-152026-fceratto.json
* 15:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93582 and previous config saved to /var/cache/conftool/dbconfig/20260602-151958-fceratto.json
* 15:19 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:19 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:18 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS trixie
* 15:18 dreamyjazz@deploy1003: matmarex, anzx, dreamyjazz: Continuing with deployment
* 15:18 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:15 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* {{safesubst:SAL entry|1=15:15 dreamyjazz@deploy1003: matmarex, anzx, dreamyjazz: Backport for [[gerrit:1295502{{!}}Revert "labswiki: Disallow account autocreation"]], [[gerrit:1283106{{!}}Remove unused 'writeapi' right]], [[gerrit:1296566{{!}}Clean up bot password configuration]], [[gerrit:1296563{{!}}Remove workaround for stuck session cookies on Wikitech (T389433)]], [[gerrit:1295574{{!}}cswiki: lift IP cap for workshop on 08-June-2026 (T427678)]], [[gerrit:1296582]]}}
* 15:14 jiji@cumin1003: START - Cookbook sre.dns.netbox
* {{safesubst:SAL entry|1=15:13 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1295502{{!}}Revert "labswiki: Disallow account autocreation"]], [[gerrit:1283106{{!}}Remove unused 'writeapi' right]], [[gerrit:1296566{{!}}Clean up bot password configuration]], [[gerrit:1296563{{!}}Remove workaround for stuck session cookies on Wikitech (T389433)]], [[gerrit:1295574{{!}}cswiki: lift IP cap for workshop on 08-June-2026 (T427678)]], [[gerrit:1296582]]}}
* 15:12 jayme@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2006.codfw.wmnet with OS trixie
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS trixie
* 15:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P93580 and previous config saved to /var/cache/conftool/dbconfig/20260602-150951-fceratto.json
* 15:09 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296514{{!}}[Growth] Set wgGEMentorshipCleanupEnabled to false on all wikis (T427386)]] (duration: 06m 22s)
* 15:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Repooling after Icing wait-for-green timeout
* 15:06 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1050.eqiad.wmnet
* 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1049.eqiad.wmnet
* 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1049.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:05 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1049.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:02 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1296514{{!}}[Growth] Set wgGEMentorshipCleanupEnabled to false on all wikis (T427386)]]
* 15:02 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS trixie
* 15:01 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P93578 and previous config saved to /var/cache/conftool/dbconfig/20260602-145943-fceratto.json
* 14:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 14:52 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:52 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:52 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1049.eqiad.wmnet
* 14:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS trixie
* 14:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:50 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93575 and previous config saved to /var/cache/conftool/dbconfig/20260602-144935-fceratto.json
* 14:42 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for pc2021.codfw.wmnet
* 14:42 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for pc2021.codfw.wmnet
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2250.codfw.wmnet
* 14:41 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2250.codfw.wmnet
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2158.codfw.wmnet
* 14:41 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2158.codfw.wmnet
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Repooling
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Repooling
* 14:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93573 and previous config saved to /var/cache/conftool/dbconfig/20260602-144110-fceratto.json
* 14:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2158: Repooling
* 14:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93571 and previous config saved to /var/cache/conftool/dbconfig/20260602-144043-fceratto.json
* 14:38 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:38 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:38 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1048.eqiad.wmnet
* 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1048.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 14:37 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS trixie
* 14:36 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS trixie
* 14:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P93569 and previous config saved to /var/cache/conftool/dbconfig/20260602-143035-fceratto.json
* 14:30 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 14:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1048.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 14:21 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Repooling after Icing wait-for-green timeout
* 14:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P93566 and previous config saved to /var/cache/conftool/dbconfig/20260602-142027-fceratto.json
* 14:17 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS trixie
* 14:17 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 14:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1167.eqiad.wmnet
* 14:17 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1167.eqiad.wmnet
* 14:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS trixie
* 14:15 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS trixie
* 14:14 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93564 and previous config saved to /var/cache/conftool/dbconfig/20260602-141019-fceratto.json
* 14:09 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments userOptions.php --delete --nowarn growthexperiments-homepage-variant # [[phab:T417621|T417621]]
* 14:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1048.eqiad.wmnet
* 14:08 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments userOptions.php --delete growthexperiments-homepage-variant # [[phab:T417621|T417621]]
* 14:05 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93563 and previous config saved to /var/cache/conftool/dbconfig/20260602-140140-fceratto.json
* 14:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 14:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 14:01 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS trixie
* 14:00 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 14:00 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93562 and previous config saved to /var/cache/conftool/dbconfig/20260602-140022-fceratto.json
* 14:00 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS trixie
* 13:56 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1167.eqiad.wmnet with OS trixie
* 13:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P93561 and previous config saved to /var/cache/conftool/dbconfig/20260602-135015-fceratto.json
* 13:47 topranks: revert all config to normal on cr1-codfw and ssw1-a1-codfw
* 13:43 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS trixie
* 13:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 13:40 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS trixie
* 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P93560 and previous config saved to /var/cache/conftool/dbconfig/20260602-134007-fceratto.json
* 13:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
* 13:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1002.eqiad.wmnet with OS trixie
* 13:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1003.eqiad.wmnet with OS trixie
* 13:34 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:34 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:32 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 13:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93559 and previous config saved to /var/cache/conftool/dbconfig/20260602-132959-fceratto.json
* 13:27 slyngshede@dns1004: END - running authdns-update
* 13:25 slyngshede@dns1004: START - running authdns-update
* 13:24 topranks: increase OSPF cost on ssw1-a1-codfw et-0/0/4 towards lsw1-a5-codfw [[phab:T427301|T427301]]
* 13:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 13:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93558 and previous config saved to /var/cache/conftool/dbconfig/20260602-132314-fceratto.json
* 13:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
* 13:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93557 and previous config saved to /var/cache/conftool/dbconfig/20260602-132246-fceratto.json
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS trixie
* 13:19 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 13:19 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS trixie
* 13:18 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 13:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2049: repool after upgrade
* 13:17 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:16 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1167.eqiad.wmnet with OS trixie
* 13:15 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1167: Upgrading db1167.eqiad.wmnet
* 13:13 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1167: Upgrading db1167.eqiad.wmnet
* 13:13 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:12 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 13:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P93554 and previous config saved to /var/cache/conftool/dbconfig/20260602-131238-fceratto.json
* 13:12 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 13:12 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 13:11 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 13:07 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1003.eqiad.wmnet with OS trixie
* 13:07 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS trixie
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS trixie
* 13:04 jayme@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2006.codfw.wmnet with OS trixie
* 13:04 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:03 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1022-1023].eqiad.wmnet with reason: Reimaging upstream servers
* 13:03 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1001.eqiad.wmnet with OS trixie
* 13:03 topranks: increase OSPF cost on ssw1-a1-codfw et-0/0/2 towards lsw1-a3-codfw [[phab:T427301|T427301]]
* 13:03 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 13:02 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Reimaging upstream servers
* 13:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P93553 and previous config saved to /var/cache/conftool/dbconfig/20260602-130230-fceratto.json
* 12:59 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2161: Migration of db2161.codfw.wmnet completed
* 12:54 topranks: shutdown sub-interfaces on cr1-codfw et-1/1/5 for row A/B vlans [[phab:T427301|T427301]]
* 12:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93550 and previous config saved to /var/cache/conftool/dbconfig/20260602-125223-fceratto.json
* 12:50 topranks: enable bgp graceful-shutdown in overlay on ssw1-a1-codfw [[phab:T427301|T427301]]
* 12:49 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1061.eqiad.wmnet with OS trixie
* 12:48 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt
* 12:48 ayounsi@cumin1003: START - Cookbook sre.hosts.remove-downtime for lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt
* 12:47 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS trixie
* 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93548 and previous config saved to /var/cache/conftool/dbconfig/20260602-124541-fceratto.json
* 12:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1207.eqiad.wmnet with reason: Maintenance
* 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93547 and previous config saved to /var/cache/conftool/dbconfig/20260602-124512-fceratto.json
* 12:43 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1060.eqiad.wmnet with OS trixie
* 12:42 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:42 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 12:42 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 12:41 topranks: enable bgp graceful-shutdown in underlay on ssw1-a1-codfw [[phab:T427301|T427301]]
* 12:35 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 12:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P93545 and previous config saved to /var/cache/conftool/dbconfig/20260602-123505-fceratto.json
* 12:33 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 12:33 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 12:31 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2049: repool after upgrade
* 12:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:29 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS trixie
* 12:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2049.codfw.wmnet with OS trixie
* 12:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P93542 and previous config saved to /var/cache/conftool/dbconfig/20260602-122459-fceratto.json
* 12:24 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS trixie
* 12:21 XioNoX: reboot lsw1-a3-codfw for software upgrade - [[phab:T427301|T427301]]
* 12:20 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS trixie
* 12:20 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 12:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS trixie
* 12:17 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 12:16 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296532{{!}}hCaptcha: Deduplicate edit API detection code (T427887)]], [[gerrit:1296533{{!}}hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)]] (duration: 09m 02s)
* 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93539 and previous config saved to /var/cache/conftool/dbconfig/20260602-121451-fceratto.json
* 12:11 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2049.codfw.wmnet with reason: host reimage
* 12:11 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt with reason: Switch maintenance
* 12:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2161: Migration of db2161.codfw.wmnet completed
* 12:09 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Switch maintenance
* 12:09 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296532{{!}}hCaptcha: Deduplicate edit API detection code (T427887)]], [[gerrit:1296533{{!}}hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1200 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93537 and previous config saved to /var/cache/conftool/dbconfig/20260602-120755-fceratto.json
* 12:07 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
* 12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93536 and previous config saved to /var/cache/conftool/dbconfig/20260602-120728-fceratto.json
* 12:07 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 12:07 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296532{{!}}hCaptcha: Deduplicate edit API detection code (T427887)]], [[gerrit:1296533{{!}}hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)]]
* 12:05 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2049.codfw.wmnet with reason: host reimage
* 12:04 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 12:02 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 12:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2161.codfw.wmnet with OS trixie
* 12:00 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 11:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P93535 and previous config saved to /var/cache/conftool/dbconfig/20260602-115721-fceratto.json
* 11:55 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 11:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:55 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 11:53 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 11:53 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 11:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:50 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS trixie
* 11:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS trixie
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2049.codfw.wmnet with OS trixie
* 11:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2049: Upgrading es2049.codfw.wmnet
* 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2049: Upgrading es2049.codfw.wmnet
* 11:47 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:47 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS trixie
* 11:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2056: repool after upgrade
* 11:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P93532 and previous config saved to /var/cache/conftool/dbconfig/20260602-114713-fceratto.json
* 11:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS trixie
* 11:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2161.codfw.wmnet with reason: host reimage
* 11:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2161.codfw.wmnet with reason: host reimage
* 11:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93531 and previous config saved to /var/cache/conftool/dbconfig/20260602-113705-fceratto.json
* 11:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1185 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93529 and previous config saved to /var/cache/conftool/dbconfig/20260602-113019-fceratto.json
* 11:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
* 11:29 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 11:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1161: Repooling
* 11:26 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1161: Repooling
* 11:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS trixie
* 11:22 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 11:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2161: Upgrading db2161.codfw.wmnet
* 11:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2161: Upgrading db2161.codfw.wmnet
* 11:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 11:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P93527 and previous config saved to /var/cache/conftool/dbconfig/20260602-111954-fceratto.json
* 11:15 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2161 [[phab:T427892|T427892]]', diff saved to https://phabricator.wikimedia.org/P93525 and previous config saved to /var/cache/conftool/dbconfig/20260602-111511-cwilliams.json
* 11:12 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2165 to s8 primary [[phab:T427892|T427892]]', diff saved to https://phabricator.wikimedia.org/P93524 and previous config saved to /var/cache/conftool/dbconfig/20260602-111200-cwilliams.json
* 11:10 cezmunsta: Starting s8 codfw failover from db2161 to db2165 - [[phab:T427892|T427892]]
* 11:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P93523 and previous config saved to /var/cache/conftool/dbconfig/20260602-110947-fceratto.json
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS trixie
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS trixie
* 11:04 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2165 with weight 0 [[phab:T427892|T427892]]', diff saved to https://phabricator.wikimedia.org/P93522 and previous config saved to /var/cache/conftool/dbconfig/20260602-110420-cwilliams.json
* 11:03 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 [[phab:T427892|T427892]]
* 11:02 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2056: repool after upgrade
* 11:01 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93520 and previous config saved to /var/cache/conftool/dbconfig/20260602-105939-fceratto.json
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93519 and previous config saved to /var/cache/conftool/dbconfig/20260602-105239-fceratto.json
* 10:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 10:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93518 and previous config saved to /var/cache/conftool/dbconfig/20260602-105202-fceratto.json
* 10:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2056.codfw.wmnet with OS trixie
* 10:42 moritzm: installing busybox security updates
* 10:42 claime: Enabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P93517 and previous config saved to /var/cache/conftool/dbconfig/20260602-104154-fceratto.json
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P93516 and previous config saved to /var/cache/conftool/dbconfig/20260602-103146-fceratto.json
* 10:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2056.codfw.wmnet with reason: host reimage
* 10:27 claime: Disabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 10:25 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2056.codfw.wmnet with reason: host reimage
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93515 and previous config saved to /var/cache/conftool/dbconfig/20260602-102139-fceratto.json
* 10:09 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2056.codfw.wmnet with OS trixie
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2056: Upgrading es2056.codfw.wmnet
* 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2056: Upgrading es2056.codfw.wmnet
* 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 09:56 claime: Enabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 09:46 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on cumin2003.codfw.wmnet with reason: in setup
* 09:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1187: Pooling
* 09:37 claime: Running puppet on cp6010 and cp6011 - [[phab:T422937|T422937]]
* 09:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow2004.codfw.wmnet to plain
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93511 and previous config saved to /var/cache/conftool/dbconfig/20260602-093716-fceratto.json
* 09:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1159.eqiad.wmnet with reason: Maintenance
* 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow2004.codfw.wmnet to plain
* 09:34 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of rpki2003.codfw.wmnet to plain
* 09:34 claime: Disabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 09:34 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of rpki2003.codfw.wmnet to plain
* 09:32 moritzm: temporarily remove ganeti2045 from the codfw cluster [[phab:T427357|T427357]]
* 09:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1055.eqiad.wmnet with OS trixie
* 09:15 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1187: Pooling
* 09:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1187 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93508 and previous config saved to /var/cache/conftool/dbconfig/20260602-091126-fceratto.json
* 09:09 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1187 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93506 and previous config saved to /var/cache/conftool/dbconfig/20260602-090432-fceratto.json
* 09:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
* 08:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2250.codfw.wmnet with reason: rack A3 maintenance
* 08:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS trixie
* 08:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:53 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:52 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:51 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:50 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 08:41 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:37 urbanecm: Reset user email of Barras@votewiki to the one of Barras@SUL
* 08:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 08:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93505 and previous config saved to /var/cache/conftool/dbconfig/20260602-083033-fceratto.json
* 08:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:29 slyngs: IDP, new configuration in preparation for webauthn
* 08:20 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P93504 and previous config saved to /var/cache/conftool/dbconfig/20260602-082026-fceratto.json
* 08:19 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:18 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:18 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:17 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296488{{!}}Revert "translate: adding separate read/write endpoints" (T425377)]] (duration: 03m 33s)
* 08:16 atsuko@deploy1003: atsuko: Rolling back deployment
* 08:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2053: repool after upgrade
* 08:15 atsuko@deploy1003: atsuko: Backport for [[gerrit:1296488{{!}}Revert "translate: adding separate read/write endpoints" (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:13 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1296488{{!}}Revert "translate: adding separate read/write endpoints" (T425377)]]
* 08:11 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 marostegui: Install mariadb 10.11.17 on es2053 [[phab:T427345|T427345]]
* 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P93502 and previous config saved to /var/cache/conftool/dbconfig/20260602-081018-fceratto.json
* 08:09 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Depool for rack maintenance
* 08:03 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296262{{!}}translate: fixing missed variable in credentials formatting closure (T425377)]] (duration: 14m 47s)
* 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93499 and previous config saved to /var/cache/conftool/dbconfig/20260602-080011-fceratto.json
* 07:59 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 07:59 atsuko@deploy1003: atsuko: Rolling back deployment
* 07:58 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 07:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1181 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93498 and previous config saved to /var/cache/conftool/dbconfig/20260602-075759-fceratto.json
* 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 07:57 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
* 07:50 atsuko@deploy1003: atsuko: Backport for [[gerrit:1296262{{!}}translate: fixing missed variable in credentials formatting closure (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:49 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1296262{{!}}translate: fixing missed variable in credentials formatting closure (T425377)]]
* 07:48 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1181: Pooling
* 07:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1181: Pooling
* 07:44 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1181: Reboot
* 07:43 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1181: Reboot
* 07:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1181.eqiad.wmnet with reason: Reboot
* 07:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1181: Migration of db1181.eqiad.wmnet completed
* 07:40 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294949{{!}}translate: adding separate read/write endpoints (T425377)]] (duration: 21m 01s)
* 07:39 atsuko@deploy1003: atsuko: Rolling back deployment
* 07:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93490 and previous config saved to /var/cache/conftool/dbconfig/20260602-073904-fceratto.json
* 07:32 XioNoX: pfw1-eqiad# delete protocols bgp group Production family inet6 - [[phab:T423384|T423384]]
* 07:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2053: repool after upgrade
* 07:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2158.codfw.wmnet with reason: rack A3 maintenance
* 07:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93487 and previous config saved to /var/cache/conftool/dbconfig/20260602-072856-fceratto.json
* 07:28 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2158: rack A3 maintenance
* 07:28 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2158: rack A3 maintenance
* 07:27 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on pc2021.codfw.wmnet with reason: rack A3 maintenance
* 07:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: rack A3 maintenance
* 07:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 07:25 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 07:25 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: rack A3 maintenance
* 07:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Depool for rack maintenance
* 07:23 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2241.codfw.wmnet
* 07:23 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2241.codfw.wmnet
* 07:21 atsuko@deploy1003: atsuko: Backport for [[gerrit:1294949{{!}}translate: adding separate read/write endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2053.codfw.wmnet with OS trixie
* 07:19 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1294949{{!}}translate: adding separate read/write endpoints (T425377)]]
* 07:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2241.codfw.wmnet with reason: Depool for rack maintenance
* 07:14 marostegui: Install mariadb 10.11.17 on db2186 [[phab:T427345|T427345]]
* 07:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Depool for rack maintenance
* 07:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2186.codfw.wmnet with reason: upgrade
* 07:12 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Depool for rack maintenance
* 07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2053.codfw.wmnet with reason: host reimage
* 06:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2053.codfw.wmnet with reason: host reimage
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93478 and previous config saved to /var/cache/conftool/dbconfig/20260602-065533-fceratto.json
* 06:55 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1181: Migration of db1181.eqiad.wmnet completed
* 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 06:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1181.eqiad.wmnet with OS trixie
* 06:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2053.codfw.wmnet with OS trixie
* 06:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2053: Upgrading es2053.codfw.wmnet
* 06:41 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2053: Upgrading es2053.codfw.wmnet
* 06:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 06:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 06:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 06:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1052: repool after upgrade
* 06:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
* 06:24 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
* 06:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 06:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 06:16 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 06:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 06:08 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1181.eqiad.wmnet with OS trixie
* 06:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1181: Upgrading db1181.eqiad.wmnet
* 06:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1181: Upgrading db1181.eqiad.wmnet
* 06:04 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:02 marostegui@dns1004: END - running authdns-update
* 06:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1181 [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93473 and previous config saved to /var/cache/conftool/dbconfig/20260602-060157-marostegui.json
* 06:01 marostegui@dns1004: START - running authdns-update
* 06:00 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1236 to s7 primary and set section read-write [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93472 and previous config saved to /var/cache/conftool/dbconfig/20260602-060041-marostegui.json
* 06:00 marostegui@cumin1003: dbctl commit (dc=all): 'Set s7 eqiad as read-only for maintenance - [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93471 and previous config saved to /var/cache/conftool/dbconfig/20260602-060018-marostegui.json
* 06:00 marostegui: Starting s7 eqiad failover from db1181 to db1236 - [[phab:T426088|T426088]]
* 05:51 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1236 with weight 0 [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93470 and previous config saved to /var/cache/conftool/dbconfig/20260602-055153-marostegui.json
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s7 [[phab:T426088|T426088]]
* 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1052: repool after upgrade
* 05:50 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 05:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1052.eqiad.wmnet with OS trixie
* 05:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:29 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1052.eqiad.wmnet with reason: host reimage
* 05:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1052.eqiad.wmnet with reason: host reimage
* 05:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:07 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1052.eqiad.wmnet with OS trixie
* 05:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1052: Upgrading es1052.eqiad.wmnet
* 05:06 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1052: Upgrading es1052.eqiad.wmnet
* 05:05 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 04:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 04:49 ryankemper: [[phab:T425007|T425007]] (k8s) created 4 wdqs namespaces on `dse-k8s-codfw`'s `admin_ng` ns: `wdqs-[internal,external]` & `wdqs-[internal,external]-next`; certs issued
* 04:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 04:40 ryankemper@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 04:36 ryankemper@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 04:05 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.2 (duration: 05m 33s)
== 2026-06-01 ==
* 23:27 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295963{{!}}Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542)]], [[gerrit:1295962{{!}}Carousel: Defer to MobileFrontend lightbox on mobile (T427679)]] (duration: 07m 17s)
* 23:23 jdlrobson@deploy1003: mfossati, jdlrobson: Continuing with deployment
* 23:22 jdlrobson@deploy1003: mfossati, jdlrobson: Backport for [[gerrit:1295963{{!}}Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542)]], [[gerrit:1295962{{!}}Carousel: Defer to MobileFrontend lightbox on mobile (T427679)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:20 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1295963{{!}}Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542)]], [[gerrit:1295962{{!}}Carousel: Defer to MobileFrontend lightbox on mobile (T427679)]]
* 23:15 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296022{{!}}Donor Delight Badge: Add dependency on mw.user (T427850)]], [[gerrit:1296028{{!}}styles: Limit selector to badge client pref (T427407)]] (duration: 09m 33s)
* 23:11 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:07 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1296022{{!}}Donor Delight Badge: Add dependency on mw.user (T427850)]], [[gerrit:1296028{{!}}styles: Limit selector to badge client pref (T427407)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:06 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1296022{{!}}Donor Delight Badge: Add dependency on mw.user (T427850)]], [[gerrit:1296028{{!}}styles: Limit selector to badge client pref (T427407)]]
* 23:04 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6015.*
* 22:36 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296024{{!}}Add maintenance script to scrape SVG render files]] (duration: 06m 22s)
* 22:32 reedy@deploy1003: reedy: Continuing with deployment
* 22:31 reedy@deploy1003: reedy: Backport for [[gerrit:1296024{{!}}Add maintenance script to scrape SVG render files]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:30 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1296024{{!}}Add maintenance script to scrape SVG render files]]
* 22:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 22:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 22:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 21:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 21:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 21:51 sbassett: Deployed updated mitigation for [[phab:T326691|T326691]]
* 21:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 21:35 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 21:35 maryum: Deployed security fix for [[phab:T427611|T427611]]
* 21:35 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 21:33 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 21:32 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 21:27 maryum: Deployed security fix for [[phab:T427235|T427235]]
* 21:13 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296002{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565)]], [[gerrit:1296003{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T427565)]], [[gerrit:1296009{{!}}Redirect Special:AccountRecovery to the shared domain (T427692)]] (duration: 09m 20s)
* 21:09 catrope@deploy1003: catrope, arlolra: Continuing with deployment
* 21:09 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 21:09 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 21:08 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 21:07 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 21:07 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 21:06 catrope@deploy1003: catrope, arlolra: Backport for [[gerrit:1296002{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565)]], [[gerrit:1296003{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T427565)]], [[gerrit:1296009{{!}}Redirect Special:AccountRecovery to the shared domain (T427692)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:04 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1296002{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565)]], [[gerrit:1296003{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T427565)]], [[gerrit:1296009{{!}}Redirect Special:AccountRecovery to the shared domain (T427692)]]
* 20:53 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 20:37 ryankemper@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wdqs1015.eqiad.wmnet with reason: [[phab:T427852|T427852]] hw failure
* 20:26 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285412{{!}}Remove `wgTestKitchenExperimentStreamNames` (T422358)]], [[gerrit:1295531{{!}}Enable AbuseFilter block action on nlwiki (T427384)]] (duration: 07m 48s)
* 20:22 catrope@deploy1003: sfaci, xxblackburnxx, catrope: Continuing with deployment
* 20:20 catrope@deploy1003: sfaci, xxblackburnxx, catrope: Backport for [[gerrit:1285412{{!}}Remove `wgTestKitchenExperimentStreamNames` (T422358)]], [[gerrit:1295531{{!}}Enable AbuseFilter block action on nlwiki (T427384)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:18 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1285412{{!}}Remove `wgTestKitchenExperimentStreamNames` (T422358)]], [[gerrit:1295531{{!}}Enable AbuseFilter block action on nlwiki (T427384)]]
* 20:12 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295504{{!}}passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)]] (duration: 07m 37s)
* 20:08 catrope@deploy1003: catrope: Continuing with deployment
* 20:07 catrope@deploy1003: catrope: Backport for [[gerrit:1295504{{!}}passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:05 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1295504{{!}}passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)]]
* 19:48 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 19:47 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 19:47 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 19:46 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 19:46 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 19:45 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 19:01 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: sync
* 19:00 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: sync
* 18:24 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295950{{!}}mediawiki.user_change.dev0 - key by user.wiki_id (T426198)]] (duration: 06m 42s)
* 18:20 otto@deploy1003: otto: Continuing with deployment
* 18:19 otto@deploy1003: otto: Backport for [[gerrit:1295950{{!}}mediawiki.user_change.dev0 - key by user.wiki_id (T426198)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:17 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1295950{{!}}mediawiki.user_change.dev0 - key by user.wiki_id (T426198)]]
* 18:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 18:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 18:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd2001.codfw.wmnet to plain
* 18:02 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 18:02 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd2001.codfw.wmnet to plain
* 18:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2003.codfw.wmnet to plain
* 18:01 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 18:01 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2003.codfw.wmnet to plain
* 17:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 17:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 17:53 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS trixie
* 17:42 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295976{{!}}nlwiki: change to Wikipedia 25 logo (T424519)]] (duration: 07m 29s)
* 17:37 samtar@deploy1003: chlod, samtar: Continuing with deployment
* 17:36 samtar@deploy1003: chlod, samtar: Backport for [[gerrit:1295976{{!}}nlwiki: change to Wikipedia 25 logo (T424519)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:34 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1295976{{!}}nlwiki: change to Wikipedia 25 logo (T424519)]]
* 17:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1236: Update
* 17:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd2001.codfw.wmnet to drbd
* 17:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
* 17:04 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 17:04 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1180: Pooling
* 17:03 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 17:03 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
* 17:03 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 16:59 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd2001.codfw.wmnet to drbd
* 16:58 Amir1: drop flaggedrevs tables on wikinews wikis ([[phab:T423577|T423577]])
* 16:57 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 16:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93462 and previous config saved to /var/cache/conftool/dbconfig/20260601-165717-fceratto.json
* 16:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93460 and previous config saved to /var/cache/conftool/dbconfig/20260601-164709-fceratto.json
* 16:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
* 16:37 ryankemper@cumin2002: conftool action : set/pooled=no; selector: dc=eqiad,cluster=wdqs-main,service=wdqs-main,name=wdqs1015.eqiad.wmnet
* 16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93458 and previous config saved to /var/cache/conftool/dbconfig/20260601-163701-fceratto.json
* 16:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:35 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1236.eqiad.wmnet
* 16:35 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1236.eqiad.wmnet
* 16:35 ryankemper@cumin2002: conftool action : set/pooled=no; selector: dc=eqiad,cluster=wdqs,service=wdqs-main,name=wdqs1015.eqiad.wmnet
* 16:34 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
* 16:34 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1236: Update
* 16:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1236.eqiad.wmnet with reason: Kernel update [[phab:T426633|T426633]]
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:30 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1236.eqiad.wmnet
* 16:30 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1236.eqiad.wmnet
* 16:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
* 16:29 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1236: Update
* 16:29 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
* 16:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2003.codfw.wmnet to drbd
* 16:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93455 and previous config saved to /var/cache/conftool/dbconfig/20260601-162653-fceratto.json
* 16:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 16:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Migration of db1209.eqiad.wmnet completed
* 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1236.eqiad.wmnet with reason: Kernel update [[phab:T426633|T426633]]
* 16:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1236: Update
* 16:09 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1236: Update
* 16:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:06 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 16:05 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2003.codfw.wmnet to drbd
* 16:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 16:03 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 16:02 moritzm: temporarily remove ganeti2027 from the codfw cluster [[phab:T427357|T427357]]
* 15:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:56 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.depool (exit_code=97) depool db1224: Pooling
* 15:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2005.codfw.wmnet with OS bullseye
* 15:53 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1224: Pooling
* 15:51 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 15:49 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
* 15:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
* 15:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
* 15:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2005.codfw.wmnet with reason: host reimage
* 15:40 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:40 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
* 15:40 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
* 15:40 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
* 15:40 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
* 15:40 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
* 15:39 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:39 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 15:39 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Migration of db1209.eqiad.wmnet completed
* 15:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 15:38 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:38 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
* 15:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2005.codfw.wmnet with reason: host reimage
* 15:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 15:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1209.eqiad.wmnet with OS trixie
* 15:28 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295802{{!}}hCaptcha: Raise SiteVerify error threshold to 100]] (duration: 06m 15s)
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93446 and previous config saved to /var/cache/conftool/dbconfig/20260601-152638-fceratto.json
* 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 15:26 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:25 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
* 15:25 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
* 15:25 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
* 15:25 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:24 kharlan@deploy1003: kharlan: Continuing with deployment
* 15:24 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295802{{!}}hCaptcha: Raise SiteVerify error threshold to 100]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:22 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host testvm2005.codfw.wmnet with OS bullseye
* 15:22 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295802{{!}}hCaptcha: Raise SiteVerify error threshold to 100]]
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:20 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295946{{!}}hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)]] (duration: 08m 24s)
* 15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:16 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
* 15:14 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1295946{{!}}hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1295946{{!}}hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)]]
* 15:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
* 15:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93445 and previous config saved to /var/cache/conftool/dbconfig/20260601-151024-fceratto.json
* 15:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:sessionstore
* 15:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93443 and previous config saved to /var/cache/conftool/dbconfig/20260601-150017-fceratto.json
* 14:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1209.eqiad.wmnet with OS trixie
* 14:52 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 14:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1209: Upgrading db1209.eqiad.wmnet
* 14:52 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 14:52 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1209: Upgrading db1209.eqiad.wmnet
* 14:52 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 14:51 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:51 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 14:50 atsuko@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 14:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93441 and previous config saved to /var/cache/conftool/dbconfig/20260601-145010-fceratto.json
* 14:49 atsuko@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 14:49 atsuko@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 14:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:42 atsuko@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 14:41 atsuko@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93440 and previous config saved to /var/cache/conftool/dbconfig/20260601-144002-fceratto.json
* 14:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:30 ladsgroup@deploy1003: Synchronized portals: Deploy portals ([[phab:T421797|T421797]]) (duration: 02m 43s)
* 14:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:27 ladsgroup@deploy1003: Synchronized portals/wikipedia.org/assets: Deploy portals ([[phab:T421797|T421797]]) (duration: 06m 10s)
* 14:25 sukhe@dns1004: END - running authdns-update
* 14:23 sukhe@dns1004: START - running authdns-update
* 14:22 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:11 Lucas_WMDE: UTC afternoon backport+config window done
* 14:10 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295918{{!}}Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)]] (duration: 11m 06s)
* 14:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:05 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Continuing with deployment
* 14:03 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Backport for [[gerrit:1295918{{!}}Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:01 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:sessionstore
* 13:58 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1295918{{!}}Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)]]
* 13:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1265.eqiad.wmnet with OS trixie
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93439 and previous config saved to /var/cache/conftool/dbconfig/20260601-133947-fceratto.json
* 13:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 13:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 13:35 atsukoito: restarted pybal.service on lvs2013
* 13:31 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 13:31 atsukoito: restarted pybal.service on lvs2014
* 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs-test2001.codfw.wmnet
* 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet
* 13:22 atsukoito: restarted pybal.service on lvs1019
* 13:22 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:21 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:20 atsukoito: restarted pybal.service on lvs1020
* 13:20 Msz2001: UTC afternoon backport+config window done
* 13:20 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295875{{!}}Add SetGlobalPreference maintenance script (T427476)]] (duration: 06m 22s)
* 13:19 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs-test2001.codfw.wmnet
* 13:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1265.eqiad.wmnet with OS trixie
* 13:18 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs-test1001.eqiad.wmnet
* 13:16 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 13:15 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1295875{{!}}Add SetGlobalPreference maintenance script (T427476)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 atsukoito: sudo cumin 'A:lvs-low-traffic-eqiad' 'systemctl restart pybal.service'
* 13:14 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1295875{{!}}Add SetGlobalPreference maintenance script (T427476)]]
* 13:12 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295536{{!}}swwiki: Enable the Visual Editor on the project namespace (T427117)]] (duration: 10m 06s)
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93438 and previous config saved to /var/cache/conftool/dbconfig/20260601-130949-fceratto.json
* 13:08 mszwarc@deploy1003: codenamenoreste, mszwarc: Continuing with deployment
* 13:07 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:06 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:05 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:04 mszwarc@deploy1003: codenamenoreste, mszwarc: Backport for [[gerrit:1295536{{!}}swwiki: Enable the Visual Editor on the project namespace (T427117)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:03 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:02 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1295536{{!}}swwiki: Enable the Visual Editor on the project namespace (T427117)]]
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93437 and previous config saved to /var/cache/conftool/dbconfig/20260601-125941-fceratto.json
* 12:56 dpogorzelski@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=inference,name=eqiad
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'readability' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 12:52 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:50 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93436 and previous config saved to /var/cache/conftool/dbconfig/20260601-124934-fceratto.json
* 12:48 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:41 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93435 and previous config saved to /var/cache/conftool/dbconfig/20260601-123926-fceratto.json
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:29 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:28 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:27 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to plain
* 12:26 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to plain
* 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
* 12:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to drbd
* 12:20 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:17 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:15 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:15 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:11 dpogorzelski@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=inference,name=eqiad
* 12:07 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to drbd
* 12:05 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
* 12:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:04 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2027.codfw.wmnet
* 12:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 11:59 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
* 11:59 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
* 11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93434 and previous config saved to /var/cache/conftool/dbconfig/20260601-113911-fceratto.json
* 11:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 11:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93433 and previous config saved to /var/cache/conftool/dbconfig/20260601-113843-fceratto.json
* 11:37 moritzm: installing Exim security updates
* 11:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93432 and previous config saved to /var/cache/conftool/dbconfig/20260601-112835-fceratto.json
* 11:25 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 11:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:22 moritzm: installing imagemagick security updates
* 11:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:22 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 11:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93430 and previous config saved to /var/cache/conftool/dbconfig/20260601-111827-fceratto.json
* 11:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:14 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 11:12 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 11:10 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93429 and previous config saved to /var/cache/conftool/dbconfig/20260601-110820-fceratto.json
* 11:04 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 11:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1055: repool after upgrade
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93427 and previous config saved to /var/cache/conftool/dbconfig/20260601-110121-fceratto.json
* 11:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 10:54 marostegui@dns1004: END - running authdns-update
* 10:52 marostegui@dns1004: START - running authdns-update
* 10:48 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1050 to es1 eqiad primary [[phab:T427032|T427032]]', diff saved to https://phabricator.wikimedia.org/P93425 and previous config saved to /var/cache/conftool/dbconfig/20260601-104837-marostegui.json
* 10:47 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2055 to es1 codfw primary [[phab:T427032|T427032]]', diff saved to https://phabricator.wikimedia.org/P93424 and previous config saved to /var/cache/conftool/dbconfig/20260601-104739-marostegui.json
* 10:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1177: Migration of db1177.eqiad.wmnet completed
* 10:40 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2003.codfw.wmnet
* 10:34 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2003.codfw.wmnet
* 10:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93421 and previous config saved to /var/cache/conftool/dbconfig/20260601-103316-fceratto.json
* 10:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93418 and previous config saved to /var/cache/conftool/dbconfig/20260601-102308-fceratto.json
* 10:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1055: repool after upgrade
* 10:15 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1055.eqiad.wmnet with OS trixie
* 10:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93415 and previous config saved to /var/cache/conftool/dbconfig/20260601-101300-fceratto.json
* 10:09 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:07 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93414 and previous config saved to /var/cache/conftool/dbconfig/20260601-100252-fceratto.json
* 10:00 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1177: Migration of db1177.eqiad.wmnet completed
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1055.eqiad.wmnet with reason: host reimage
* 09:56 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 09:54 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 09:53 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1055.eqiad.wmnet with reason: host reimage
* 09:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1177.eqiad.wmnet with OS trixie
* 09:51 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:50 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:39 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1055.eqiad.wmnet with OS trixie
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1055: Upgrading es1055.eqiad.wmnet
* 09:38 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1055: Upgrading es1055.eqiad.wmnet
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
* 09:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
* 09:17 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1177.eqiad.wmnet with OS trixie
* 09:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:14 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:13 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:12 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1177: Upgrading db1177.eqiad.wmnet
* 09:11 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1177: Upgrading db1177.eqiad.wmnet
* 09:11 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93410 and previous config saved to /var/cache/conftool/dbconfig/20260601-090237-fceratto.json
* 09:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93409 and previous config saved to /var/cache/conftool/dbconfig/20260601-090209-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P93408 and previous config saved to /var/cache/conftool/dbconfig/20260601-085202-fceratto.json
* 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P93407 and previous config saved to /var/cache/conftool/dbconfig/20260601-084154-fceratto.json
* 08:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93406 and previous config saved to /var/cache/conftool/dbconfig/20260601-083146-fceratto.json
* 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93405 and previous config saved to /var/cache/conftool/dbconfig/20260601-082442-fceratto.json
* 08:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 07:58 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295454{{!}}Disable the creation of synthetic main refs in production (T427484)]] (duration: 11m 26s)
* 07:56 XioNoX: add no_p2p term to pfw1-codfw BGP_fundraising_export - [[phab:T423384|T423384]]
* 07:52 wmde-fisch@deploy1003: lilients, wmde-fisch: Continuing with deployment
* 07:51 wmde-fisch@deploy1003: lilients, wmde-fisch: Backport for [[gerrit:1295454{{!}}Disable the creation of synthetic main refs in production (T427484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:47 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1295454{{!}}Disable the creation of synthetic main refs in production (T427484)]]
* 07:45 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294826{{!}}Update VE core submodule to master (9cf5524e7) (T424232)]] (duration: 31m 34s)
* 07:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 07:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 07:32 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
* 07:31 wmde-fisch@deploy1003: wmde-fisch: Backport for [[gerrit:1294826{{!}}Update VE core submodule to master (9cf5524e7) (T424232)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 07:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1294826{{!}}Update VE core submodule to master (9cf5524e7) (T424232)]]
* 06:48 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 06:47 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
== 2026-05-31 ==
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 30s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-30 ==
* 16:21 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:38 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 27s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-29 ==
* 23:39 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 23:37 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 21:42 catrope@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 21:41 catrope@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:40 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] (duration: 06m 54s)
* 17:35 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 17:34 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:33 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]]
* 16:30 jgreen@dns1004: END - running authdns-update
* 16:28 jgreen@dns1004: START - running authdns-update
* 16:13 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:12 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:28 dancy@deploy1003: Installation of scap version "4.267.0" completed for 2 hosts
* 15:26 dancy@deploy1003: Installing scap version "4.267.0" for 2 host(s)
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:15 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] (duration: 07m 58s)
* 14:11 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:09 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]]
* 13:53 moritzm: imported OpenJDK 21 21.0.11+10-1~deb12u1 to component/jdk21 (backport of latest Java 21 security release for Bookworm)
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1006.wikimedia.org
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1006.wikimedia.org with OS trixie
* 11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1006.wikimedia.org with OS trixie
* 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:15 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:13 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:00 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:00 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1006.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1005.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1005.wikimedia.org with OS trixie
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:40 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Pooling
* 10:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1005.wikimedia.org with OS trixie
* 10:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:55 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 09:50 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 09:49 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:44 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2014.codfw.wmnet with OS bookworm
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:20 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:12 jynus@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:10 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:03 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad2002.codfw.wmnet
* 08:59 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad2002.codfw.wmnet
* 08:59 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad2002.codfw.wmnet - [[phab:T427588|T427588]]
* 08:54 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:51 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad1004.eqiad.wmnet
* 08:50 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
* 08:50 jynus@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host backup2014.codfw.wmnet with OS bookworm
* 08:49 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
* 08:47 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad1004.eqiad.wmnet
* 08:46 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad1004.eqiad.wmnet - [[phab:T427588|T427588]]
* 08:42 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:39 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:39 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:38 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 08:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 08:33 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:31 jynus@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup2014.codfw.wmnet with OS bookworm
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:18 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 08:17 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 08:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1005.wikimedia.org
* 08:05 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 07:54 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 07:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2212.codfw.wmnet
* 07:54 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2212.codfw.wmnet
* 07:22 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2006.wikimedia.org
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2006.wikimedia.org with OS trixie
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2006.wikimedia.org with OS trixie
* 06:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2006.wikimedia.org
* 03:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts db1224.eqiad.wmnet
* 02:56 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 01:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5032.eqsin.wmnet with OS trixie
* 01:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 01:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 00:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 00:29 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host cp5032.eqsin.wmnet
* 00:23 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:22 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
== 2026-05-28 ==
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:07 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:34 andrewbogott: reprepro includedeb trixie-wikimedia /home/andrew/magnum-cluster-api_0.36.6-1~wmf13u2_amd64.deb
* 22:31 logmsgbot: dreamyjazz Deployed security patch for [[phab:T426388|T426388]]
* 21:33 maryum: Deployed security fix for [[phab:T426867|T426867]]
* 21:21 alexsanford: Deployed security fix for [[phab:T426889|T426889]]
* 21:07 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host cp5032.eqsin.wmnet
* 21:04 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 21:04 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 20:48 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] (duration: 07m 34s)
* 20:44 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:43 arlolra@deploy1003: arlolra: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:41 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]]
* 20:34 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] (duration: 07m 20s)
* 20:30 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:29 arlolra@deploy1003: arlolra: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:27 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]]
* 20:22 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] (duration: 09m 07s)
* 20:18 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Continuing with deployment
* 20:14 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] synced to the testservers (see https://wikitech.
* 20:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5032.eqsin.wmnet with OS trixie
* 20:13 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]]
* 19:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1018.eqiad.wmnet
* 19:27 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1018.eqiad.wmnet
* 19:09 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: Kernel reboot
* 19:09 brett: Stopping pybal/puppet/downtiming lvs1018.eqiad.wmnet for reboot
* 19:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1019.eqiad.wmnet
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1019.eqiad.wmnet
* 18:52 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:51 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:47 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:40 mutante: planet1003/planet2003 - apt-get upgrade - all pending package upgrades
* 18:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: Kernel reboot
* 18:34 brett: Stopping pybal/puppet/downtiming lvs1019.eqiad.wmnet for reboot and BIOS update/memory self-healing - [[phab:T426109|T426109]]
* 18:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2011.codfw.wmnet
* 18:25 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2011.codfw.wmnet
* 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Kernel reboot
* 18:19 brett: Stopping pybal/puppet/downtiming lvs2011.codfw.wmnet for reboot
* 18:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 18:06 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 18:00 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: Kernel reboot
* 17:57 brett: Stopping pybal/puppet/downtiming lvs2013.codfw.wmnet for reboot
* 17:19 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 16:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93393 and previous config saved to /var/cache/conftool/dbconfig/20260528-164514-fceratto.json
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93392 and previous config saved to /var/cache/conftool/dbconfig/20260528-163507-fceratto.json
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93391 and previous config saved to /var/cache/conftool/dbconfig/20260528-162459-fceratto.json
* 16:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db1224.eqiad.wmnet with reason: unreachable [[phab:T427535|T427535]]
* 16:17 swfrench-wmf: reprepro include xdebug_3.4.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include wikidiff2_1.14.1-2+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include php-yaml_2.2.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-xhprof_2.3.10-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-wmerrors_2.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-uuid_1.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-redis_6.2.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-pcov_1.0.12-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-memcached_3.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:15 swfrench-wmf: reprepro include php-luasandbox_4.1.2-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93390 and previous config saved to /var/cache/conftool/dbconfig/20260528-161452-fceratto.json
* 16:14 swfrench-wmf: reprepro include php-imagick_3.7.0-13+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:14 swfrench-wmf: reprepro include php-excimer_1.2.5-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1251 ([[phab:T426633|T426633]])', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260528-160646-fceratto.json
* 16:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1251.eqiad.wmnet with reason: Maintenance
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93388 and previous config saved to /var/cache/conftool/dbconfig/20260528-160613-fceratto.json
* 15:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93387 and previous config saved to /var/cache/conftool/dbconfig/20260528-155605-fceratto.json
* 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93386 and previous config saved to /var/cache/conftool/dbconfig/20260528-154557-fceratto.json
* 15:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93385 and previous config saved to /var/cache/conftool/dbconfig/20260528-153550-fceratto.json
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93384 and previous config saved to /var/cache/conftool/dbconfig/20260528-152736-fceratto.json
* 15:27 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: Maintenance
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93383 and previous config saved to /var/cache/conftool/dbconfig/20260528-152708-fceratto.json
* 15:20 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5032.eqsin.wmnet with reason: Testing reimaging on new subnet
* 15:18 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 15:17 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93382 and previous config saved to /var/cache/conftool/dbconfig/20260528-151701-fceratto.json
* 15:17 jhathaway: dmarc ingress test on mx-in1001
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93381 and previous config saved to /var/cache/conftool/dbconfig/20260528-150653-fceratto.json
* 14:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93380 and previous config saved to /var/cache/conftool/dbconfig/20260528-145646-fceratto.json
* 14:56 moritzm: installing nginx security updates
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93379 and previous config saved to /var/cache/conftool/dbconfig/20260528-144936-fceratto.json
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: Maintenance
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93378 and previous config saved to /var/cache/conftool/dbconfig/20260528-144909-fceratto.json
* 14:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2005.wikimedia.org
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2005.wikimedia.org with OS trixie
* 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:39 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93377 and previous config saved to /var/cache/conftool/dbconfig/20260528-143901-fceratto.json
* 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93376 and previous config saved to /var/cache/conftool/dbconfig/20260528-142854-fceratto.json
* 14:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] (duration: 11m 29s)
* 14:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93375 and previous config saved to /var/cache/conftool/dbconfig/20260528-141846-fceratto.json
* 14:15 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93374 and previous config saved to /var/cache/conftool/dbconfig/20260528-141029-fceratto.json
* 14:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1232.eqiad.wmnet with reason: Maintenance
* 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2005.wikimedia.org with OS trixie
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93373 and previous config saved to /var/cache/conftool/dbconfig/20260528-141001-fceratto.json
* 14:09 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]]
* 14:00 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93371 and previous config saved to /var/cache/conftool/dbconfig/20260528-135951-fceratto.json
* 13:58 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6015.drmrs.wmnet,service=(cdn{{!}}ats-be)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93370 and previous config saved to /var/cache/conftool/dbconfig/20260528-134944-fceratto.json
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93369 and previous config saved to /var/cache/conftool/dbconfig/20260528-133936-fceratto.json
* 13:39 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:38 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:36 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] (duration: 06m 40s)
* 13:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93368 and previous config saved to /var/cache/conftool/dbconfig/20260528-133230-fceratto.json
* 13:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93367 and previous config saved to /var/cache/conftool/dbconfig/20260528-133202-fceratto.json
* 13:31 mlitn@deploy1003: mlitn: Continuing with deployment
* 13:31 mlitn@deploy1003: mlitn: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:29 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]]
* 13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93366 and previous config saved to /var/cache/conftool/dbconfig/20260528-132155-fceratto.json
* 13:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:17 elukey: clean up a lot of stale Kafka ACLs on Kafka Jumbo - Details in [[phab:T425528|T425528]]
* 13:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2005.wikimedia.org
* 13:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93365 and previous config saved to /var/cache/conftool/dbconfig/20260528-131147-fceratto.json
* 13:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93364 and previous config saved to /var/cache/conftool/dbconfig/20260528-130139-fceratto.json
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93363 and previous config saved to /var/cache/conftool/dbconfig/20260528-125439-fceratto.json
* 12:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93362 and previous config saved to /var/cache/conftool/dbconfig/20260528-125412-fceratto.json
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93361 and previous config saved to /var/cache/conftool/dbconfig/20260528-124404-fceratto.json
* 12:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:43 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:39 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:38 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93360 and previous config saved to /var/cache/conftool/dbconfig/20260528-123357-fceratto.json
* 12:25 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93359 and previous config saved to /var/cache/conftool/dbconfig/20260528-122349-fceratto.json
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93358 and previous config saved to /var/cache/conftool/dbconfig/20260528-121551-fceratto.json
* 12:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance
* 12:15 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93357 and previous config saved to /var/cache/conftool/dbconfig/20260528-121523-fceratto.json
* 12:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93356 and previous config saved to /var/cache/conftool/dbconfig/20260528-120515-fceratto.json
* 12:02 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 12:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93355 and previous config saved to /var/cache/conftool/dbconfig/20260528-115508-fceratto.json
* 11:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93354 and previous config saved to /var/cache/conftool/dbconfig/20260528-114500-fceratto.json
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93353 and previous config saved to /var/cache/conftool/dbconfig/20260528-113635-fceratto.json
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93352 and previous config saved to /var/cache/conftool/dbconfig/20260528-113559-fceratto.json
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93351 and previous config saved to /var/cache/conftool/dbconfig/20260528-112551-fceratto.json
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93350 and previous config saved to /var/cache/conftool/dbconfig/20260528-111543-fceratto.json
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93349 and previous config saved to /var/cache/conftool/dbconfig/20260528-110536-fceratto.json
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93348 and previous config saved to /var/cache/conftool/dbconfig/20260528-105820-fceratto.json
* 10:58 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1195.eqiad.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93347 and previous config saved to /var/cache/conftool/dbconfig/20260528-105753-fceratto.json
* 10:56 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 10:50 moritzm: update trixie netboot image for 13.5 point release [[phab:T427072|T427072]]
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93346 and previous config saved to /var/cache/conftool/dbconfig/20260528-104745-fceratto.json
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93345 and previous config saved to /var/cache/conftool/dbconfig/20260528-103738-fceratto.json
* 10:29 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P13724 # [[phab:T406971|T406971]]
* 10:28 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P14223 # [[phab:T422264|T422264]]
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93344 and previous config saved to /var/cache/conftool/dbconfig/20260528-102730-fceratto.json
* 10:26 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P1748 # [[phab:T422392|T422392]]
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93343 and previous config saved to /var/cache/conftool/dbconfig/20260528-101900-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93342 and previous config saved to /var/cache/conftool/dbconfig/20260528-101829-fceratto.json
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93341 and previous config saved to /var/cache/conftool/dbconfig/20260528-100822-fceratto.json
* 09:59 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] (duration: 06m 41s)
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93340 and previous config saved to /var/cache/conftool/dbconfig/20260528-095814-fceratto.json
* 09:55 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 09:54 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:52 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]]
* 09:48 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] (duration: 07m 37s)
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93339 and previous config saved to /var/cache/conftool/dbconfig/20260528-094807-fceratto.json
* 09:44 dreamyjazz@deploy1003: dreamyjazz, stran: Continuing with deployment
* 09:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:42 dreamyjazz@deploy1003: dreamyjazz, stran: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]]
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93338 and previous config saved to /var/cache/conftool/dbconfig/20260528-093920-fceratto.json
* 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93337 and previous config saved to /var/cache/conftool/dbconfig/20260528-093849-fceratto.json
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93336 and previous config saved to /var/cache/conftool/dbconfig/20260528-092842-fceratto.json
* 09:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
* 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93335 and previous config saved to /var/cache/conftool/dbconfig/20260528-092239-fceratto.json
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki-root1001.eqiad.wmnet
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93334 and previous config saved to /var/cache/conftool/dbconfig/20260528-091834-fceratto.json
* 09:18 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1165: Reboot completed
* 09:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:17 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 09:14 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:13 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki-root1001.eqiad.wmnet
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93332 and previous config saved to /var/cache/conftool/dbconfig/20260528-091231-fceratto.json
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93331 and previous config saved to /var/cache/conftool/dbconfig/20260528-090826-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93329 and previous config saved to /var/cache/conftool/dbconfig/20260528-090224-fceratto.json
* 09:02 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod (duration: 02m 31s)
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93328 and previous config saved to /var/cache/conftool/dbconfig/20260528-090114-fceratto.json
* 09:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2216.codfw.wmnet with reason: Maintenance
* 09:00 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a] (duration: 02m 08s)
* 08:59 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod
* 08:58 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a]
* 08:57 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host (duration: 00m 53s)
* 08:56 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host
* 08:56 joal@deploy1003: Finished deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a] (duration: 06m 54s)
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93327 and previous config saved to /var/cache/conftool/dbconfig/20260528-085216-fceratto.json
* 08:50 XioNoX: cr1-codfw# delete protocols bgp group fundraising family inet6 - [[phab:T423384|T423384]]
* 08:49 joal@deploy1003: Started deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a]
* 08:49 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] (duration: 09m 20s)
* 08:49 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a] (duration: 02m 00s)
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93326 and previous config saved to /var/cache/conftool/dbconfig/20260528-084906-fceratto.json
* 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
* 08:48 slyngshede@dns1004: END - running authdns-update
* 08:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1165: Reboot completed
* 08:47 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a]
* 08:47 slyngs: Upgrade IDP to CAS 7.3.7.1
* 08:46 slyngshede@dns1004: START - running authdns-update
* 08:45 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93324 and previous config saved to /var/cache/conftool/dbconfig/20260528-084149-fceratto.json
* 08:41 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]]
* 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 08:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93323 and previous config saved to /var/cache/conftool/dbconfig/20260528-083504-fceratto.json
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93322 and previous config saved to /var/cache/conftool/dbconfig/20260528-083331-fceratto.json
* 08:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Test
* 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93320 and previous config saved to /var/cache/conftool/dbconfig/20260528-082324-fceratto.json
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2189: repool after crash
* 08:17 slyngshede@dns1004: END - running authdns-update
* 08:16 slyngshede@dns1004: START - running authdns-update
* 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93318 and previous config saved to /var/cache/conftool/dbconfig/20260528-081316-fceratto.json
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:09 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Test
* 08:05 hashar@deploy1003: Finished deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016 (duration: 00m 13s)
* 08:05 hashar@deploy1003: Started deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93315 and previous config saved to /var/cache/conftool/dbconfig/20260528-080309-fceratto.json
* 07:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93314 and previous config saved to /var/cache/conftool/dbconfig/20260528-075631-fceratto.json
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020,1022-1023].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
* 07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93313 and previous config saved to /var/cache/conftool/dbconfig/20260528-075521-fceratto.json
* 07:47 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93311 and previous config saved to /var/cache/conftool/dbconfig/20260528-074513-fceratto.json
* 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2189: repool after crash
* 07:36 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93309 and previous config saved to /var/cache/conftool/dbconfig/20260528-073506-fceratto.json
* 07:34 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:29 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] (duration: 06m 29s)
* 07:25 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Continuing with deployment
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93308 and previous config saved to /var/cache/conftool/dbconfig/20260528-072458-fceratto.json
* 07:24 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:24 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:23 tgr@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwikisource --logwiki=metawiki Ioed Renamed_user_4232d41570b9e8f46ef150e5e360e446 # [[phab:T427459|T427459]]
* 07:22 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]]
* 07:20 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] (duration: 06m 54s)
* 07:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93307 and previous config saved to /var/cache/conftool/dbconfig/20260528-071836-fceratto.json
* 07:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 07:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Reboot completed
* 07:16 wmde-fisch@deploy1003: wmde-fisch, robertsky: Continuing with deployment
* 07:15 wmde-fisch@deploy1003: wmde-fisch, robertsky: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]]
* 07:11 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] (duration: 07m 15s)
* 07:07 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Continuing with deployment
* 07:06 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:04 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]]
* 06:43 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Reboot completed
* 06:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93303 and previous config saved to /var/cache/conftool/dbconfig/20260528-064217-fceratto.json
* 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93302 and previous config saved to /var/cache/conftool/dbconfig/20260528-063357-fceratto.json
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 06:25 hashar: Restarting CI Jenkins for plugins upgrades
* 06:16 fceratto@dns1005: END - running authdns-update
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1209 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93301 and previous config saved to /var/cache/conftool/dbconfig/20260528-061609-fceratto.json
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1193 to s8 primary and set section read-write [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93300 and previous config saved to /var/cache/conftool/dbconfig/20260528-061138-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s8 eqiad as read-only for maintenance - [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93299 and previous config saved to /var/cache/conftool/dbconfig/20260528-061048-fceratto.json
* 06:10 federico3: Starting s8 eqiad failover from db1209 to db1193 - [[phab:T426095|T426095]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1193 with weight 0 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93298 and previous config saved to /var/cache/conftool/dbconfig/20260528-060412-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 [[phab:T426095|T426095]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:53 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:49 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:25 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] (duration: 07m 12s)
* 00:21 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]]
* 00:12 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] (duration: 07m 25s)
* 00:09 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:06 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:04 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]]
* 00:04 swfrench-wmf: reprepro include php-apcu_5.1.24-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
== 2026-05-27 ==
* 23:13 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] (duration: 08m 42s)
* 23:09 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Continuing with deployment
* 23:06 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:04 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]]
* 22:58 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] (duration: 07m 49s)
* 22:55 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.sanitarium_restart (exit_code=0)
* 22:54 catrope@deploy1003: catrope: Continuing with deployment
* 22:52 catrope@deploy1003: catrope: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:50 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]]
* 22:46 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] (duration: 06m 54s)
* 22:42 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:41 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:40 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitarium_restart (exit_code=99)
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]]
* 22:39 ladsgroup@deploy1003: Finished scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) (duration: 07m 16s)
* 22:35 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:34 ladsgroup@deploy1003: ladsgroup: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:33 ladsgroup@deploy1003: Started scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]])
* 22:13 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] (duration: 10m 00s)
* 22:09 egardner@deploy1003: egardner: Continuing with deployment
* 22:05 egardner@deploy1003: egardner: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:03 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]]
* 21:37 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15 days, 0:00:00 on relforge[1008-1010].eqiad.wmnet with reason: non-production environment
* 21:20 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 21:19 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 21:04 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] (duration: 07m 38s)
* 20:59 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Continuing with deployment
* 20:58 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:56 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]]
* 20:51 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 30s)
* 20:47 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:46 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:44 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]]
* 20:43 swfrench-wmf: reprepro include dh-php_5.5+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:39 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts lvs1016.eqiad.wmnet
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:38 swfrench-wmf: reprepro include php-defaults_94+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:37 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:31 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:27 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf12u2 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:25 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs1016.eqiad.wmnet
* 20:25 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] (duration: 08m 11s)
* 20:21 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1016.eqiad.wmnet with OS bullseye
* 20:21 sbisson@deploy1003: sbisson: Continuing with deployment
* 20:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1020.eqiad.wmnet
* 20:19 sbisson@deploy1003: sbisson: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]]
* 20:14 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12355
* 20:04 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 12355
* 19:51 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1016.eqiad.wmnet with OS bullseye
* 19:48 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5032.eqsin.wmnet
* 19:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 19:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 19:01 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f] (duration: 02m 08s)
* 18:59 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f]
* 18:58 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 05m 01s)
* 18:53 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:53 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] (duration: 07m 41s)
* 18:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5031.eqsin.wmnet
* 18:49 catrope@deploy1003: catrope: Continuing with deployment
* 18:47 catrope@deploy1003: catrope: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:45 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]]
* 18:40 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 01m 05s)
* 18:39 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:37 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f] (duration: 02m 04s)
* 18:35 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f]
* 18:29 swfrench@deploy1003: Finished scap sync-world: Helmfile-only deployment to clean up unused mesh listeners (duration: 06m 12s)
* 18:25 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:24 swfrench@deploy1003: swfrench: Helmfile-only deployment to clean up unused mesh listeners synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:23 swfrench@deploy1003: Started scap sync-world: Helmfile-only deployment to clean up unused mesh listeners
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93296 and previous config saved to /var/cache/conftool/dbconfig/20260527-181923-fceratto.json
* 18:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93295 and previous config saved to /var/cache/conftool/dbconfig/20260527-180915-fceratto.json
* 18:09 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 18:09 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] (duration: 10m 24s)
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1017.eqiad.wmnet
* 18:08 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1017.eqiad.wmnet
* 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5024.eqsin.wmnet
* 18:03 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:00 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:00 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:00 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93294 and previous config saved to /var/cache/conftool/dbconfig/20260527-175908-fceratto.json
* 17:58 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]]
* 17:55 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93293 and previous config saved to /var/cache/conftool/dbconfig/20260527-174900-fceratto.json
* 17:43 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] (duration: 15m 01s)
* 17:38 swfrench@deploy1003: swfrench: Continuing with deployment
* 17:31 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:28 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]]
* 17:25 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1114.eqiad.wmnet
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:05 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] (duration: 08m 44s)
* 17:00 swfrench@deploy1003: swfrench: Continuing with deployment
* 16:58 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:56 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]]
* 16:53 atsuko@dns1004: END - running authdns-update
* 16:51 atsuko@dns1004: START - running authdns-update
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93292 and previous config saved to /var/cache/conftool/dbconfig/20260527-164846-fceratto.json
* 16:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93291 and previous config saved to /var/cache/conftool/dbconfig/20260527-164815-fceratto.json
* 16:43 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1112.eqiad.wmnet
* 16:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: Setting up
* 16:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93290 and previous config saved to /var/cache/conftool/dbconfig/20260527-163808-fceratto.json
* 16:37 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Repooling after testing patch
* 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93287 and previous config saved to /var/cache/conftool/dbconfig/20260527-162800-fceratto.json
* 16:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93285 and previous config saved to /var/cache/conftool/dbconfig/20260527-161753-fceratto.json
* 16:14 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 16:12 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 16:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93284 and previous config saved to /var/cache/conftool/dbconfig/20260527-161101-fceratto.json
* 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
* 16:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93283 and previous config saved to /var/cache/conftool/dbconfig/20260527-161034-fceratto.json
* 16:10 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 16:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1178: Recovering from failure in cookbook
* 16:10 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 16:05 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host durum5003.eqsin.wmnet with OS trixie
* 16:03 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6016.drmrs.wmnet
* 16:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93280 and previous config saved to /var/cache/conftool/dbconfig/20260527-160027-fceratto.json
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1017.eqiad.wmnet
* 15:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2163.codfw.wmnet
* 15:53 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2163.codfw.wmnet
* 15:52 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1017.eqiad.wmnet
* 15:52 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Repooling after testing patch
* 15:52 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 15:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Testing cookbook
* 15:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Testing cookbook
* 15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93276 and previous config saved to /var/cache/conftool/dbconfig/20260527-155019-fceratto.json
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93274 and previous config saved to /var/cache/conftool/dbconfig/20260527-154011-fceratto.json
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1178: Recovering from failure in cookbook
* 15:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1178.eqiad.wmnet
* 15:22 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1178.eqiad.wmnet
* 15:19 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 15:19 cdanis: 💙cdanis@cp4047.ulsfo.wmnet ~ 🕦☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:13 cdanis: 💙cdanis@cp5026.eqsin.wmnet ~ 🕚☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Icinga wait failed during run
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕚☕ sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93268 and previous config saved to /var/cache/conftool/dbconfig/20260527-150508-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1220.eqiad.wmnet with reason: Maintenance
* 15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93267 and previous config saved to /var/cache/conftool/dbconfig/20260527-150438-fceratto.json
* 14:59 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 14:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93264 and previous config saved to /var/cache/conftool/dbconfig/20260527-145430-fceratto.json
* 14:54 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2163.codfw.wmnet with OS trixie
* 14:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:46 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] (duration: 08m 32s)
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1178.eqiad.wmnet with OS trixie
* 14:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93263 and previous config saved to /var/cache/conftool/dbconfig/20260527-144423-fceratto.json
* 14:42 aude@deploy1003: aude: Continuing with deployment
* 14:40 aude@deploy1003: aude: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2189.codfw.wmnet with reason: crashed [[phab:T427376|T427376]]
* 14:38 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]]
* 14:35 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] (duration: 11m 30s)
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93262 and previous config saved to /var/cache/conftool/dbconfig/20260527-143416-fceratto.json
* 14:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 aude@deploy1003: aude: Continuing with deployment
* 14:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:27 aude@deploy1003: aude: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93260 and previous config saved to /var/cache/conftool/dbconfig/20260527-142659-fceratto.json
* 14:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:23 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]]
* 14:22 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1033.eqiad.wmnet with reason: Maintenance
* 14:18 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] (duration: 33m 01s)
* 14:10 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2163.codfw.wmnet with OS trixie
* 14:09 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS trixie
* 14:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1178: Upgrading db1178.eqiad.wmnet
* 14:07 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1178: Upgrading db1178.eqiad.wmnet
* 14:06 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 stran@deploy1003: stran: Continuing with deployment
* 14:02 stran@deploy1003: stran: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:56 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2164: Migration of db2164.codfw.wmnet completed
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:45 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]]
* 13:40 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] (duration: 11m 35s)
* 13:36 phuedx@deploy1003: phuedx: Continuing with deployment
* 13:30 phuedx@deploy1003: phuedx: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]]
* 13:21 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] (duration: 13m 23s)
* 13:15 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2189: Test
* 13:15 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 13:15 mlitn@deploy1003: krinkle, mlitn: Continuing with deployment
* 13:13 mlitn@deploy1003: krinkle, mlitn: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2164: Migration of db2164.codfw.wmnet completed
* 13:08 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]]
* 13:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2212.codfw.wmnet with reason: failed to reboot [[phab:T427388|T427388]] [[phab:T426633|T426633]]
* 13:05 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2164.codfw.wmnet with OS trixie
* 12:57 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1192.eqiad.wmnet with OS trixie
* 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:35 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:28 Amir1: deleting binlogs older than a year
* 12:22 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2164.codfw.wmnet with OS trixie
* 12:21 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 36692
* 12:21 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1192.eqiad.wmnet with OS trixie
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1077
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
* 12:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2164: Upgrading db2164.codfw.wmnet
* 12:20 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 36692
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1079
* 12:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2164: Upgrading db2164.codfw.wmnet
* 12:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1192: Upgrading db1192.eqiad.wmnet
* 12:19 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1192: Upgrading db1192.eqiad.wmnet
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:15 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Migration of db2165.codfw.wmnet completed
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:12 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2189: Test
* 12:11 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1193: Migration of db1193.eqiad.wmnet completed
* 12:09 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93243 and previous config saved to /var/cache/conftool/dbconfig/20260527-120452-fceratto.json
* 12:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93242 and previous config saved to /var/cache/conftool/dbconfig/20260527-120205-fceratto.json
* 12:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 11:58 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:58 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:56 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93239 and previous config saved to /var/cache/conftool/dbconfig/20260527-115157-fceratto.json
* 11:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93237 and previous config saved to /var/cache/conftool/dbconfig/20260527-114149-fceratto.json
* 11:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93235 and previous config saved to /var/cache/conftool/dbconfig/20260527-113142-fceratto.json
* 11:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Migration of db2165.codfw.wmnet completed
* 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1193: Migration of db1193.eqiad.wmnet completed
* 11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93231 and previous config saved to /var/cache/conftool/dbconfig/20260527-112327-fceratto.json
* 11:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2188.codfw.wmnet with reason: Maintenance
* 11:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93230 and previous config saved to /var/cache/conftool/dbconfig/20260527-112257-fceratto.json
* 11:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2165.codfw.wmnet with OS trixie
* 11:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1193.eqiad.wmnet with OS trixie
* 11:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93229 and previous config saved to /var/cache/conftool/dbconfig/20260527-111250-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:10 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93227 and previous config saved to /var/cache/conftool/dbconfig/20260527-110242-fceratto.json
* 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2189', diff saved to https://phabricator.wikimedia.org/P93226 and previous config saved to /var/cache/conftool/dbconfig/20260527-110016-marostegui.json
* 10:58 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:57 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 10:56 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93225 and previous config saved to /var/cache/conftool/dbconfig/20260527-105235-fceratto.json
* 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93223 and previous config saved to /var/cache/conftool/dbconfig/20260527-104518-fceratto.json
* 10:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93222 and previous config saved to /var/cache/conftool/dbconfig/20260527-104449-fceratto.json
* 10:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2165.codfw.wmnet with OS trixie
* 10:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1193.eqiad.wmnet with OS trixie
* 10:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2165: Upgrading db2165.codfw.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2165: Upgrading db2165.codfw.wmnet
* 10:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93218 and previous config saved to /var/cache/conftool/dbconfig/20260527-103441-fceratto.json
* 10:29 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:29 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93217 and previous config saved to /var/cache/conftool/dbconfig/20260527-102434-fceratto.json
* 10:22 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:21 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93215 and previous config saved to /var/cache/conftool/dbconfig/20260527-101426-fceratto.json
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1203: Migration of db1203.eqiad.wmnet completed
* 10:10 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2166: Migration of db2166.codfw.wmnet completed
* 10:08 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93212 and previous config saved to /var/cache/conftool/dbconfig/20260527-100701-fceratto.json
* 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 10:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93211 and previous config saved to /var/cache/conftool/dbconfig/20260527-100632-fceratto.json
* 10:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 10:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1050.eqiad.wmnet with OS trixie
* 09:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93208 and previous config saved to /var/cache/conftool/dbconfig/20260527-095624-fceratto.json
* 09:47 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93206 and previous config saved to /var/cache/conftool/dbconfig/20260527-094616-fceratto.json
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:43 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:38 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 09:38 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 09:37 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 09:37 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 09:36 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93203 and previous config saved to /var/cache/conftool/dbconfig/20260527-093609-fceratto.json
* 09:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93202 and previous config saved to /var/cache/conftool/dbconfig/20260527-092842-fceratto.json
* 09:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 09:28 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1203: Migration of db1203.eqiad.wmnet completed
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93200 and previous config saved to /var/cache/conftool/dbconfig/20260527-092814-fceratto.json
* 09:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1050.eqiad.wmnet with OS trixie
* 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 09:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2166: Migration of db2166.codfw.wmnet completed
* 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2051: repool after maintenance
* 09:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1203.eqiad.wmnet with OS trixie
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93196 and previous config saved to /var/cache/conftool/dbconfig/20260527-091806-fceratto.json
* 09:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2166.codfw.wmnet with OS trixie
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93194 and previous config saved to /var/cache/conftool/dbconfig/20260527-090759-fceratto.json
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3074.*
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3066.*
* 09:03 fabfur: repooling cp3074 and cp3066 ([[phab:T419825|T419825]])
* 09:02 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6015.drmrs.wmnet
* 09:02 slyngshede@cumin1003: START - Cookbook sre.hosts.remove-downtime for cp6015.drmrs.wmnet
* 09:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 09:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp6015.*
* 08:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93193 and previous config saved to /var/cache/conftool/dbconfig/20260527-085751-fceratto.json
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 08:54 Emperor: restart swift on ms-fe2011 [[phab:T360913|T360913]]
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3066.*
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3074.*
* 08:51 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:50 fabfur: depooling and installing haproxy-awslc on cp3074 and cp3066 ([[phab:T419825|T419825]])
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93191 and previous config saved to /var/cache/conftool/dbconfig/20260527-085024-fceratto.json
* 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93190 and previous config saved to /var/cache/conftool/dbconfig/20260527-085005-fceratto.json
* 08:41 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1203.eqiad.wmnet with OS trixie
* 08:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93189 and previous config saved to /var/cache/conftool/dbconfig/20260527-083957-fceratto.json
* 08:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2051: repool after maintenance
* 08:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 08:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:35 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2166.codfw.wmnet with OS trixie
* 08:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2051.codfw.wmnet with OS trixie
* 08:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93185 and previous config saved to /var/cache/conftool/dbconfig/20260527-082950-fceratto.json
* 08:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93184 and previous config saved to /var/cache/conftool/dbconfig/20260527-081942-fceratto.json
* 08:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:11 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93183 and previous config saved to /var/cache/conftool/dbconfig/20260527-081112-fceratto.json
* 08:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93182 and previous config saved to /var/cache/conftool/dbconfig/20260527-081054-fceratto.json
* 08:07 jmm@dns1004: END - running authdns-update
* 08:05 jmm@dns1004: START - running authdns-update
* 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93181 and previous config saved to /var/cache/conftool/dbconfig/20260527-080046-fceratto.json
* 07:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2051.codfw.wmnet with OS trixie
* 07:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93180 and previous config saved to /var/cache/conftool/dbconfig/20260527-075039-fceratto.json
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1026.eqiad.wmnet
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2051: Upgrading es2051.codfw.wmnet
* 07:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2051: Upgrading es2051.codfw.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93178 and previous config saved to /var/cache/conftool/dbconfig/20260527-074031-fceratto.json
* 07:40 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] (duration: 06m 42s)
* 07:36 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 07:35 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93177 and previous config saved to /var/cache/conftool/dbconfig/20260527-073504-fceratto.json
* 07:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2248.codfw.wmnet with reason: Maintenance
* 07:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93176 and previous config saved to /var/cache/conftool/dbconfig/20260527-073434-fceratto.json
* 07:33 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]]
* 07:28 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93175 and previous config saved to /var/cache/conftool/dbconfig/20260527-072426-fceratto.json
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.decommission (exit_code=0)
* 07:23 marostegui@cumin1003: Removing pc1014 from zarcillo [[phab:T427190|T427190]]
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1014.eqiad.wmnet
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:23 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:18 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1026.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1025.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93174 and previous config saved to /var/cache/conftool/dbconfig/20260527-071418-fceratto.json
* 07:13 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1014.eqiad.wmnet
* 07:13 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2003.wikimedia.org
* 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2055: repool after maintenance
* 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2003.wikimedia.org
* 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1003.wikimedia.org
* 07:06 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: Maintenance on db1190
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93172 and previous config saved to /var/cache/conftool/dbconfig/20260527-070410-fceratto.json
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1003.wikimedia.org
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93171 and previous config saved to /var/cache/conftool/dbconfig/20260527-065545-fceratto.json
* 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93170 and previous config saved to /var/cache/conftool/dbconfig/20260527-065526-fceratto.json
* 06:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1025.eqiad.wmnet
* 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93168 and previous config saved to /var/cache/conftool/dbconfig/20260527-064519-fceratto.json
* 06:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93166 and previous config saved to /var/cache/conftool/dbconfig/20260527-063511-fceratto.json
* 06:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93165 and previous config saved to /var/cache/conftool/dbconfig/20260527-062503-fceratto.json
* 06:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2055: repool after maintenance
* 06:21 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2055.codfw.wmnet with OS trixie
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93163 and previous config saved to /var/cache/conftool/dbconfig/20260527-061643-fceratto.json
* 06:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93162 and previous config saved to /var/cache/conftool/dbconfig/20260527-061613-fceratto.json
* 06:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93161 and previous config saved to /var/cache/conftool/dbconfig/20260527-060606-fceratto.json
* 06:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:56 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93160 and previous config saved to /var/cache/conftool/dbconfig/20260527-055558-fceratto.json
* 05:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93159 and previous config saved to /var/cache/conftool/dbconfig/20260527-054550-fceratto.json
* 05:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2055.codfw.wmnet with OS trixie
* 05:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:38 moritzm: remove ganeti1026 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93157 and previous config saved to /var/cache/conftool/dbconfig/20260527-053727-fceratto.json
* 05:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93156 and previous config saved to /var/cache/conftool/dbconfig/20260527-053708-fceratto.json
* 05:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93155 and previous config saved to /var/cache/conftool/dbconfig/20260527-052700-fceratto.json
* 05:26 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1014 from dbctl [[phab:T427270|T427270]]', diff saved to https://phabricator.wikimedia.org/P93154 and previous config saved to /var/cache/conftool/dbconfig/20260527-052624-marostegui.json
* 05:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93153 and previous config saved to /var/cache/conftool/dbconfig/20260527-051653-fceratto.json
* 05:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93152 and previous config saved to /var/cache/conftool/dbconfig/20260527-050645-fceratto.json
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93151 and previous config saved to /var/cache/conftool/dbconfig/20260527-045827-fceratto.json
* 04:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93150 and previous config saved to /var/cache/conftool/dbconfig/20260527-045759-fceratto.json
* 04:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93149 and previous config saved to /var/cache/conftool/dbconfig/20260527-044751-fceratto.json
* 04:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93148 and previous config saved to /var/cache/conftool/dbconfig/20260527-043744-fceratto.json
* 04:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93147 and previous config saved to /var/cache/conftool/dbconfig/20260527-042737-fceratto.json
* 04:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93146 and previous config saved to /var/cache/conftool/dbconfig/20260527-041921-fceratto.json
* 04:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 04:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93145 and previous config saved to /var/cache/conftool/dbconfig/20260527-041852-fceratto.json
* 04:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93144 and previous config saved to /var/cache/conftool/dbconfig/20260527-040844-fceratto.json
* 03:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93143 and previous config saved to /var/cache/conftool/dbconfig/20260527-035836-fceratto.json
* 03:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93142 and previous config saved to /var/cache/conftool/dbconfig/20260527-034828-fceratto.json
* 03:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93141 and previous config saved to /var/cache/conftool/dbconfig/20260527-034008-fceratto.json
* 03:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 03:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93140 and previous config saved to /var/cache/conftool/dbconfig/20260527-033938-fceratto.json
* 03:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93139 and previous config saved to /var/cache/conftool/dbconfig/20260527-032931-fceratto.json
* 03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93138 and previous config saved to /var/cache/conftool/dbconfig/20260527-031923-fceratto.json
* 03:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93137 and previous config saved to /var/cache/conftool/dbconfig/20260527-030915-fceratto.json
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93136 and previous config saved to /var/cache/conftool/dbconfig/20260527-030045-fceratto.json
* 03:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: Maintenance
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93135 and previous config saved to /var/cache/conftool/dbconfig/20260527-030016-fceratto.json
* 02:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93134 and previous config saved to /var/cache/conftool/dbconfig/20260527-025008-fceratto.json
* 02:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93133 and previous config saved to /var/cache/conftool/dbconfig/20260527-024000-fceratto.json
* 02:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93132 and previous config saved to /var/cache/conftool/dbconfig/20260527-022953-fceratto.json
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93131 and previous config saved to /var/cache/conftool/dbconfig/20260527-022133-fceratto.json
* 02:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93130 and previous config saved to /var/cache/conftool/dbconfig/20260527-022100-fceratto.json
* 02:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93129 and previous config saved to /var/cache/conftool/dbconfig/20260527-021053-fceratto.json
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93128 and previous config saved to /var/cache/conftool/dbconfig/20260527-020045-fceratto.json
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93127 and previous config saved to /var/cache/conftool/dbconfig/20260527-015037-fceratto.json
* 01:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93126 and previous config saved to /var/cache/conftool/dbconfig/20260527-014204-fceratto.json
* 01:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 01:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93125 and previous config saved to /var/cache/conftool/dbconfig/20260527-014134-fceratto.json
* 01:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93124 and previous config saved to /var/cache/conftool/dbconfig/20260527-013126-fceratto.json
* 01:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93123 and previous config saved to /var/cache/conftool/dbconfig/20260527-012119-fceratto.json
* 01:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93122 and previous config saved to /var/cache/conftool/dbconfig/20260527-011111-fceratto.json
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93121 and previous config saved to /var/cache/conftool/dbconfig/20260527-010234-fceratto.json
* 01:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93120 and previous config saved to /var/cache/conftool/dbconfig/20260527-010205-fceratto.json
* 00:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93119 and previous config saved to /var/cache/conftool/dbconfig/20260527-005157-fceratto.json
* 00:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93118 and previous config saved to /var/cache/conftool/dbconfig/20260527-004149-fceratto.json
* 00:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93117 and previous config saved to /var/cache/conftool/dbconfig/20260527-003141-fceratto.json
* 00:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93116 and previous config saved to /var/cache/conftool/dbconfig/20260527-002309-fceratto.json
* 00:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 00:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93115 and previous config saved to /var/cache/conftool/dbconfig/20260527-002228-fceratto.json
* 00:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93114 and previous config saved to /var/cache/conftool/dbconfig/20260527-001220-fceratto.json
* 00:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93113 and previous config saved to /var/cache/conftool/dbconfig/20260527-000209-fceratto.json
== 2026-05-26 ==
* 23:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93112 and previous config saved to /var/cache/conftool/dbconfig/20260526-235201-fceratto.json
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93111 and previous config saved to /var/cache/conftool/dbconfig/20260526-234451-fceratto.json
* 23:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93110 and previous config saved to /var/cache/conftool/dbconfig/20260526-234421-fceratto.json
* 23:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93109 and previous config saved to /var/cache/conftool/dbconfig/20260526-233414-fceratto.json
* 23:27 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 23:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93108 and previous config saved to /var/cache/conftool/dbconfig/20260526-232406-fceratto.json
* 23:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93107 and previous config saved to /var/cache/conftool/dbconfig/20260526-231358-fceratto.json
* 23:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93106 and previous config saved to /var/cache/conftool/dbconfig/20260526-230650-fceratto.json
* 23:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93105 and previous config saved to /var/cache/conftool/dbconfig/20260526-230620-fceratto.json
* 22:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93104 and previous config saved to /var/cache/conftool/dbconfig/20260526-225612-fceratto.json
* 22:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93103 and previous config saved to /var/cache/conftool/dbconfig/20260526-224604-fceratto.json
* 22:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93101 and previous config saved to /var/cache/conftool/dbconfig/20260526-223556-fceratto.json
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93100 and previous config saved to /var/cache/conftool/dbconfig/20260526-222848-fceratto.json
* 22:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93099 and previous config saved to /var/cache/conftool/dbconfig/20260526-222828-fceratto.json
* 22:23 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp6015.drmrs.wmnet
* 22:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93098 and previous config saved to /var/cache/conftool/dbconfig/20260526-221819-fceratto.json
* 22:10 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS trixie
* 22:08 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS trixie
* 22:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93097 and previous config saved to /var/cache/conftool/dbconfig/20260526-220811-fceratto.json
* 22:04 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] (duration: 09m 30s)
* 22:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 22:00 egardner@deploy1003: egardner, mfossati: Continuing with deployment
* 21:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93096 and previous config saved to /var/cache/conftool/dbconfig/20260526-215803-fceratto.json
* 21:57 egardner@deploy1003: egardner, mfossati: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:56 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:56 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1010.eqiad.wmnet with OS trixie
* 21:56 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:55 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]]
* 21:54 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 21:51 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93095 and previous config saved to /var/cache/conftool/dbconfig/20260526-215043-fceratto.json
* 21:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93094 and previous config saved to /var/cache/conftool/dbconfig/20260526-215011-fceratto.json
* 21:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:47 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1009
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1009
* 21:43 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1009
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:42 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1008
* 21:40 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1008
* 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93093 and previous config saved to /var/cache/conftool/dbconfig/20260526-214003-fceratto.json
* 21:36 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1008
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:36 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:35 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1010
* 21:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1010
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1010.eqiad.wmnet with OS trixie
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1009
* 21:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS trixie
* 21:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93092 and previous config saved to /var/cache/conftool/dbconfig/20260526-212955-fceratto.json
* 21:29 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1008
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS trixie
* 21:27 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "all.dblist - mediamoderation-continuous-scan.dblist - preinstall.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose` in tmux session - [[phab:T421688|T421688]]
* 21:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93091 and previous config saved to /var/cache/conftool/dbconfig/20260526-211948-fceratto.json
* 21:19 jhathaway: DMARC ingress test run mx-in1001
* 21:15 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw and A:cp
* 21:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2057.codfw.wmnet
* 21:14 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_codfw and A:cp
* 21:14 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2058.codfw.wmnet
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93090 and previous config saved to /var/cache/conftool/dbconfig/20260526-211238-fceratto.json
* 21:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2222.codfw.wmnet with reason: Maintenance
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93089 and previous config saved to /var/cache/conftool/dbconfig/20260526-211207-fceratto.json
* 21:06 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 21:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93088 and previous config saved to /var/cache/conftool/dbconfig/20260526-210159-fceratto.json
* 20:55 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2003.codfw.wmnet with reason: WIP
* 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93087 and previous config saved to /var/cache/conftool/dbconfig/20260526-205152-fceratto.json
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:50 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:45 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93086 and previous config saved to /var/cache/conftool/dbconfig/20260526-204143-fceratto.json
* 20:38 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2055.codfw.wmnet
* 20:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93085 and previous config saved to /var/cache/conftool/dbconfig/20260526-203430-fceratto.json
* 20:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2221.codfw.wmnet with reason: Maintenance
* 20:34 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2056.codfw.wmnet
* 20:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93084 and previous config saved to /var/cache/conftool/dbconfig/20260526-203357-fceratto.json
* 20:32 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 20:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 20:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93083 and previous config saved to /var/cache/conftool/dbconfig/20260526-202349-fceratto.json
* 20:18 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] (duration: 09m 14s)
* 20:14 alexsanford@deploy1003: alexsanford, aude: Continuing with deployment
* 20:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93082 and previous config saved to /var/cache/conftool/dbconfig/20260526-201341-fceratto.json
* 20:11 alexsanford@deploy1003: alexsanford, aude: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]]
* 20:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93081 and previous config saved to /var/cache/conftool/dbconfig/20260526-200333-fceratto.json
* 19:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2053.codfw.wmnet
* 19:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2029.codfw.wmnet with OS trixie
* 19:57 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2028.codfw.wmnet with OS trixie
* 19:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93080 and previous config saved to /var/cache/conftool/dbconfig/20260526-195632-fceratto.json
* 19:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
* 19:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93079 and previous config saved to /var/cache/conftool/dbconfig/20260526-195557-fceratto.json
* 19:55 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2054.codfw.wmnet
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93078 and previous config saved to /var/cache/conftool/dbconfig/20260526-194549-fceratto.json
* 19:45 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2014.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2013.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:38 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93077 and previous config saved to /var/cache/conftool/dbconfig/20260526-193541-fceratto.json
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:30 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93076 and previous config saved to /var/cache/conftool/dbconfig/20260526-192533-fceratto.json
* 19:24 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:21 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 19:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2051.codfw.wmnet
* 19:19 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93075 and previous config saved to /var/cache/conftool/dbconfig/20260526-191818-fceratto.json
* 19:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 19:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93074 and previous config saved to /var/cache/conftool/dbconfig/20260526-191748-fceratto.json
* 19:16 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2052.codfw.wmnet
* 19:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93073 and previous config saved to /var/cache/conftool/dbconfig/20260526-190740-fceratto.json
* 19:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 19:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93072 and previous config saved to /var/cache/conftool/dbconfig/20260526-185732-fceratto.json
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93071 and previous config saved to /var/cache/conftool/dbconfig/20260526-184724-fceratto.json
* 18:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
* 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
* 18:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb2014.codfw.wmnet with OS trixie
* 18:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2049.codfw.wmnet
* 18:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93070 and previous config saved to /var/cache/conftool/dbconfig/20260526-184009-fceratto.json
* 18:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93069 and previous config saved to /var/cache/conftool/dbconfig/20260526-183939-fceratto.json
* 18:37 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2050.codfw.wmnet
* 18:30 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93068 and previous config saved to /var/cache/conftool/dbconfig/20260526-182931-fceratto.json
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:29 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:24 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 18:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93066 and previous config saved to /var/cache/conftool/dbconfig/20260526-181923-fceratto.json
* 18:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93065 and previous config saved to /var/cache/conftool/dbconfig/20260526-180915-fceratto.json
* 18:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93064 and previous config saved to /var/cache/conftool/dbconfig/20260526-180205-fceratto.json
* 18:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 18:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93063 and previous config saved to /var/cache/conftool/dbconfig/20260526-180132-fceratto.json
* 18:00 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2047.codfw.wmnet
* 17:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2048.codfw.wmnet
* 17:54 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93062 and previous config saved to /var/cache/conftool/dbconfig/20260526-175124-fceratto.json
* 17:42 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] (duration: 07m 25s)
* 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93060 and previous config saved to /var/cache/conftool/dbconfig/20260526-174117-fceratto.json
* 17:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ms-be2089.codfw.wmnet
* 17:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]]
* 17:33 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93059 and previous config saved to /var/cache/conftool/dbconfig/20260526-173109-fceratto.json
* 17:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:26 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:25 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93058 and previous config saved to /var/cache/conftool/dbconfig/20260526-172332-fceratto.json
* 17:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93057 and previous config saved to /var/cache/conftool/dbconfig/20260526-172303-fceratto.json
* 17:21 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet
* 17:20 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 17:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet
* 17:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:17 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 17:14 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:13 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:13 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93056 and previous config saved to /var/cache/conftool/dbconfig/20260526-171255-fceratto.json
* 17:11 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93055 and previous config saved to /var/cache/conftool/dbconfig/20260526-170247-fceratto.json
* 17:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:57 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93054 and previous config saved to /var/cache/conftool/dbconfig/20260526-165240-fceratto.json
* 16:50 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93053 and previous config saved to /var/cache/conftool/dbconfig/20260526-164421-fceratto.json
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
* 16:44 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93052 and previous config saved to /var/cache/conftool/dbconfig/20260526-164352-fceratto.json
* 16:42 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2043.codfw.wmnet
* 16:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2044.codfw.wmnet
* 16:40 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:40 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:40 brett: reboot lvs101[345].eqiad.wmnet
* 16:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:36 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:35 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_codfw and A:cp
* 16:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93051 and previous config saved to /var/cache/conftool/dbconfig/20260526-163344-fceratto.json
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_codfw and A:cp
* 16:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:31 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93050 and previous config saved to /var/cache/conftool/dbconfig/20260526-162336-fceratto.json
* 16:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93049 and previous config saved to /var/cache/conftool/dbconfig/20260526-161328-fceratto.json
* 16:11 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:11 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=eqiad
* 16:06 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93047 and previous config saved to /var/cache/conftool/dbconfig/20260526-160450-fceratto.json
* 16:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93046 and previous config saved to /var/cache/conftool/dbconfig/20260526-160420-fceratto.json
* 16:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:03 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 28s)
* 16:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 16:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 22s)
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93045 and previous config saved to /var/cache/conftool/dbconfig/20260526-155413-fceratto.json
* 15:46 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=eqiad
* 15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93044 and previous config saved to /var/cache/conftool/dbconfig/20260526-154405-fceratto.json
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93043 and previous config saved to /var/cache/conftool/dbconfig/20260526-153357-fceratto.json
* 15:30 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93042 and previous config saved to /var/cache/conftool/dbconfig/20260526-152629-fceratto.json
* 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93041 and previous config saved to /var/cache/conftool/dbconfig/20260526-152559-fceratto.json
* 15:24 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93040 and previous config saved to /var/cache/conftool/dbconfig/20260526-151552-fceratto.json
* 15:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Rack maintenance completed
* 15:10 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2196.codfw.wmnet
* 15:10 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2196.codfw.wmnet
* 15:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=codfw
* 15:06 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: Rack maintenance completed
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93037 and previous config saved to /var/cache/conftool/dbconfig/20260526-150546-fceratto.json
* 15:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: Rack maintenance completed
* 15:04 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]] (duration: 00m 39s)
* 15:03 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]]
* 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]] (duration: 00m 45s)
* 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]]
* 15:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 15:01 bjensen: uploading prometheus-memcached-exporter_0.16.0-1_amd64 on apt1002
* 15:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 15:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2223: switch maintenance
* 14:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Rack maintenance completed
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93033 and previous config saved to /var/cache/conftool/dbconfig/20260526-145538-fceratto.json
* 14:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 moritzm: remove ganeti1025 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2222: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2221: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:49 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93030 and previous config saved to /var/cache/conftool/dbconfig/20260526-144718-fceratto.json
* 14:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 14:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93029 and previous config saved to /var/cache/conftool/dbconfig/20260526-144651-fceratto.json
* 14:45 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:45 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:43 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=codfw
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2167: Migration of db2167.codfw.wmnet completed
* 14:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93026 and previous config saved to /var/cache/conftool/dbconfig/20260526-143643-fceratto.json
* 14:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1054.eqiad.wmnet with OS trixie
* 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93023 and previous config saved to /var/cache/conftool/dbconfig/20260526-142636-fceratto.json
* 14:26 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1014: Rack maintenance completed
* 14:24 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1014: Rack maintenance completed
* 14:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 14:19 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:19 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:18 jynus: restarting mediabackups@codfw after maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93021 and previous config saved to /var/cache/conftool/dbconfig/20260526-141628-fceratto.json
* 14:16 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:14 fabfur: repooled cp2043 ([[phab:T426199|T426199]])
* 14:14 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2223: switch maintenance
* 14:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:14 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 14:13 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] (duration: 06m 40s)
* 14:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:10 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:10 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2011.codfw.wmnet
* 14:10 fabfur@cumin1003: START - Cookbook sre.hosts.remove-downtime for lvs2011.codfw.wmnet
* 14:09 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:09 fabfur: restoring lvs2011 as primary ([[phab:T426199|T426199]])
* 14:08 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:08 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93017 and previous config saved to /var/cache/conftool/dbconfig/20260526-140748-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93016 and previous config saved to /var/cache/conftool/dbconfig/20260526-140718-fceratto.json
* 14:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]]
* 14:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
* 14:05 marostegui@cumin1003: Removing pc1013 from zarcillo [[phab:T427190|T427190]]
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1013.eqiad.wmnet
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:04 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:00 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 13:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93014 and previous config saved to /var/cache/conftool/dbconfig/20260526-135711-fceratto.json
* 13:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1054.eqiad.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2167: Migration of db2167.codfw.wmnet completed
* 13:53 Amir1: drop flaggedrevs tables on cawikinews ([[phab:T423577|T423577]])
* 13:49 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1013.eqiad.wmnet
* 13:49 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 13:48 Lucas_WMDE: UTC afternoon backport+config window done
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93012 and previous config saved to /var/cache/conftool/dbconfig/20260526-134703-fceratto.json
* 13:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2167.codfw.wmnet with OS trixie
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93011 and previous config saved to /var/cache/conftool/dbconfig/20260526-133656-fceratto.json
* 13:36 XioNoX: reboot lsw1-a2-codfw for software upgrade - [[phab:T426199|T426199]]
* 13:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: switch maintenance
* 13:35 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] (duration: 09m 28s)
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2221: switch maintenance
* 13:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: switch maintenance
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2196: switch maintenance
* 13:31 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:30 stran@deploy1003: stran: Continuing with deployment
* 13:29 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93006 and previous config saved to /var/cache/conftool/dbconfig/20260526-132927-fceratto.json
* 13:29 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: Maintenance
* 13:29 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 34 hosts with reason: Switch maintenance
* 13:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93005 and previous config saved to /var/cache/conftool/dbconfig/20260526-132857-fceratto.json
* 13:28 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lsw1-a2-codfw,lsw1-a2-codfw IPv6,lsw1-a2-codfw.mgmt with reason: Switch maintenance
* 13:27 stran@deploy1003: stran: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]]
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] (duration: 08m 30s)
* 13:22 ladsgroup@dns1004: END - running authdns-update
* 13:20 ladsgroup@dns1004: START - running authdns-update
* 13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93004 and previous config saved to /var/cache/conftool/dbconfig/20260526-131850-fceratto.json
* 13:18 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Continuing with deployment
* 13:16 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]]
* 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] (duration: 07m 09s)
* 13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93003 and previous config saved to /var/cache/conftool/dbconfig/20260526-130842-fceratto.json
* 13:08 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:07 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2167.codfw.wmnet with OS trixie
* 13:07 sbisson@deploy1003: sbisson: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2167: Upgrading db2167.codfw.wmnet
* 13:05 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]]
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2167: Upgrading db2167.codfw.wmnet
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:04 kart_: Update Recommendation API to 2026-05-26-074931-production
* 13:03 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:00 topranks: deactivate CR BGP to doh2002 to test backup path via doh2001
* 12:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93000 and previous config saved to /var/cache/conftool/dbconfig/20260526-125834-fceratto.json
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92999 and previous config saved to /var/cache/conftool/dbconfig/20260526-125135-fceratto.json
* 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2226.codfw.wmnet with reason: Maintenance
* 12:51 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92998 and previous config saved to /var/cache/conftool/dbconfig/20260526-125105-fceratto.json
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92997 and previous config saved to /var/cache/conftool/dbconfig/20260526-124059-fceratto.json
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2003.wikimedia.org
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1214: Migration of db1214.eqiad.wmnet completed
* 12:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc2003.wikimedia.org
* 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92995 and previous config saved to /var/cache/conftool/dbconfig/20260526-123052-fceratto.json
* 12:26 fabfur: depooled cp2043 for network activity ([[phab:T426199|T426199]])
* 12:26 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 12:24 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-a1-codfw,ssw1-a1-codfw IPv6,ssw1-a1-codfw.mgmt with reason: Switch maintenance
* 12:24 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mirror1001.wikimedia.org
* 12:23 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 12:23 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 12:22 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92993 and previous config saved to /var/cache/conftool/dbconfig/20260526-122044-fceratto.json
* 12:20 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 12:19 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mirror1001.wikimedia.org
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92991 and previous config saved to /var/cache/conftool/dbconfig/20260526-121336-fceratto.json
* 12:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: Maintenance
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92990 and previous config saved to /var/cache/conftool/dbconfig/20260526-121306-fceratto.json
* 12:09 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Planned downtime for rack maintenance
* 12:08 fabfur: downtime, disable puppet and stop pybal for rack maintenance ([[phab:T426199|T426199]])
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2181: Migration of db2181.codfw.wmnet completed
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92987 and previous config saved to /var/cache/conftool/dbconfig/20260526-120258-fceratto.json
* 12:01 XioNoX: start ssw1-a1-codfw network maintenance (no impact expected as the spines are redundant)
* 11:59 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] (duration: 15m 26s)
* 11:56 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup2015.codfw.wmnet,db2197.codfw.wmnet with reason: network maintenance
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1005.eqiad.wmnet
* 11:55 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
* 11:54 jynus: stopping mediabackups@codfw for maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 11:54 jmm@dns1004: END - running authdns-update
* 11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92985 and previous config saved to /var/cache/conftool/dbconfig/20260526-115251-fceratto.json
* 11:52 jmm@dns1004: START - running authdns-update
* 11:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1005.eqiad.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1214: Migration of db1214.eqiad.wmnet completed
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1004.eqiad.wmnet
* 11:47 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:46 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1004.eqiad.wmnet
* 11:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]]
* 11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92983 and previous config saved to /var/cache/conftool/dbconfig/20260526-114243-fceratto.json
* 11:42 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1214.eqiad.wmnet with OS trixie
* 11:35 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] (duration: 06m 46s)
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92981 and previous config saved to /var/cache/conftool/dbconfig/20260526-113542-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92980 and previous config saved to /var/cache/conftool/dbconfig/20260526-113521-fceratto.json
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1222: Migration of db1222.eqiad.wmnet completed
* 11:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]]
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92978 and previous config saved to /var/cache/conftool/dbconfig/20260526-112513-fceratto.json
* 11:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92977 and previous config saved to /var/cache/conftool/dbconfig/20260526-112326-marostegui.json
* 11:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2181: Migration of db2181.codfw.wmnet completed
* 11:22 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92975 and previous config saved to /var/cache/conftool/dbconfig/20260526-112215-marostegui.json
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Switchover es2042 es2041 for [[phab:T426199|T426199]]', diff saved to https://phabricator.wikimedia.org/P92974 and previous config saved to /var/cache/conftool/dbconfig/20260526-112028-fceratto.json
* 11:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92972 and previous config saved to /var/cache/conftool/dbconfig/20260526-111506-fceratto.json
* 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2181.codfw.wmnet with OS trixie
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92971 and previous config saved to /var/cache/conftool/dbconfig/20260526-110458-fceratto.json
* 11:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1214.eqiad.wmnet with OS trixie
* 11:00 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] (duration: 15m 50s)
* 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92968 and previous config saved to /var/cache/conftool/dbconfig/20260526-105755-fceratto.json
* 10:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92967 and previous config saved to /var/cache/conftool/dbconfig/20260526-105726-fceratto.json
* 10:56 jiji@deploy1003: jiji: Continuing with deployment
* 10:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:51 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92966 and previous config saved to /var/cache/conftool/dbconfig/20260526-104718-fceratto.json
* 10:46 jiji@deploy1003: jiji: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:44 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]]
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92964 and previous config saved to /var/cache/conftool/dbconfig/20260526-103711-fceratto.json
* 10:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2181.codfw.wmnet with OS trixie
* 10:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:32 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92963 and previous config saved to /var/cache/conftool/dbconfig/20260526-102703-fceratto.json
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1226: Migration of db1226.eqiad.wmnet completed
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92960 and previous config saved to /var/cache/conftool/dbconfig/20260526-101936-fceratto.json
* 10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92959 and previous config saved to /var/cache/conftool/dbconfig/20260526-101842-fceratto.json
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:15 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] (duration: 06m 42s)
* 10:09 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92957 and previous config saved to /var/cache/conftool/dbconfig/20260526-100834-fceratto.json
* 10:06 kharlan@deploy1003: kharlan: Continuing with deployment
* 10:05 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:03 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]]
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2195: Migration of db2195.codfw.wmnet completed
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2004.codfw.wmnet
* 10:01 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2004.codfw.wmnet
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
* 09:58 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92955 and previous config saved to /var/cache/conftool/dbconfig/20260526-095827-fceratto.json
* 09:58 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-eqiad@eqiad
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:55 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2003.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2003.codfw.wmnet
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1006.eqiad.wmnet
* 09:53 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1006.eqiad.wmnet
* 09:52 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-eqiad@eqiad
* 09:52 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] (duration: 08m 07s)
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2044.*
* 09:48 fabfur: repooling cp2043 and cp2044 (haproxy-awslc) ([[phab:T419825|T419825]])
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92953 and previous config saved to /var/cache/conftool/dbconfig/20260526-094819-fceratto.json
* 09:47 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:46 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1006.eqiad.wmnet
* 09:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:44 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]]
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1006.eqiad.wmnet
* 09:41 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1005.eqiad.wmnet
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1005.eqiad.wmnet
* 09:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92951 and previous config saved to /var/cache/conftool/dbconfig/20260526-094115-fceratto.json
* 09:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 09:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92950 and previous config saved to /var/cache/conftool/dbconfig/20260526-094045-fceratto.json
* 09:40 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1226: Migration of db1226.eqiad.wmnet completed
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:38 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:34 fabfur: depooling cp2044 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1005.eqiad.wmnet
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2003.codfw.wmnet
* 09:34 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2044.*
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1005.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1004.eqiad.wmnet
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1004.eqiad.wmnet
* 09:33 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 09:32 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] (duration: 06m 52s)
* 09:32 fabfur: depooling cp2043 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1226.eqiad.wmnet with OS trixie
* 09:30 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 09:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92947 and previous config saved to /var/cache/conftool/dbconfig/20260526-093031-fceratto.json
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2003.codfw.wmnet
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2002.codfw.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2002.codfw.wmnet
* 09:28 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:28 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1003.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1003.eqiad.wmnet
* 09:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]]
* 09:25 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:25 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2001.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2001.codfw.wmnet
* 09:21 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:20 fabfur: start rebooting esams liberica instances ([[phab:T426563|T426563]])
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92946 and previous config saved to /var/cache/conftool/dbconfig/20260526-092024-fceratto.json
* 09:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1003.eqiad.wmnet
* 09:16 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2195: Migration of db2195.codfw.wmnet completed
* 09:15 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1003.eqiad.wmnet
* 09:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 09:14 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] (duration: 06m 47s)
* 09:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92944 and previous config saved to /var/cache/conftool/dbconfig/20260526-091016-fceratto.json
* 09:09 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 09:09 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2195.codfw.wmnet with OS trixie
* 09:07 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]]
* 09:06 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92943 and previous config saved to /var/cache/conftool/dbconfig/20260526-090315-fceratto.json
* 09:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2224.codfw.wmnet with reason: Maintenance
* 09:03 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92942 and previous config saved to /var/cache/conftool/dbconfig/20260526-090256-fceratto.json
* 08:57 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1226.eqiad.wmnet with OS trixie
* 08:53 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 fabfur: start rebooting ulsfo liberica instances ([[phab:T426563|T426563]])
* 08:53 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] (duration: 07m 23s)
* 08:53 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1226: Upgrading db1226.eqiad.wmnet
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92941 and previous config saved to /var/cache/conftool/dbconfig/20260526-085248-fceratto.json
* 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1226: Upgrading db1226.eqiad.wmnet
* 08:50 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Migration of db1222.eqiad.wmnet completed
* 08:48 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 08:47 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:46 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]]
* 08:43 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 08:43 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] (duration: 09m 56s)
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92939 and previous config saved to /var/cache/conftool/dbconfig/20260526-084240-fceratto.json
* 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1222.eqiad.wmnet with OS trixie
* 08:40 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:40 fabfur: start rebooting eqsin liberica instances ([[phab:T426563|T426563]])
* 08:39 kartik@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 08:39 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:39 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1024.eqiad.wmnet
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:35 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:33 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]]
* 08:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92938 and previous config saved to /var/cache/conftool/dbconfig/20260526-083233-fceratto.json
* 08:30 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92937 and previous config saved to /var/cache/conftool/dbconfig/20260526-082531-fceratto.json
* 08:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
* 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92936 and previous config saved to /var/cache/conftool/dbconfig/20260526-082458-fceratto.json
* 08:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2195.codfw.wmnet with OS trixie
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2195: Upgrading db2195.codfw.wmnet
* 08:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2195: Upgrading db2195.codfw.wmnet
* 08:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:18 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92934 and previous config saved to /var/cache/conftool/dbconfig/20260526-081451-fceratto.json
* 08:13 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:12 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:10 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92932 and previous config saved to /var/cache/conftool/dbconfig/20260526-080443-fceratto.json
* 08:01 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1222.eqiad.wmnet with OS trixie
* 08:00 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1024.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1023.eqiad.wmnet
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:59 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:58 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:56 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 07:56 fabfur: start rebooting drmrs liberica instances ([[phab:T426563|T426563]])
* 07:56 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92931 and previous config saved to /var/cache/conftool/dbconfig/20260526-075435-fceratto.json
* 07:52 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1047.eqiad.wmnet
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:49 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1023.eqiad.wmnet
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92930 and previous config saved to /var/cache/conftool/dbconfig/20260526-074739-fceratto.json
* 07:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92929 and previous config saved to /var/cache/conftool/dbconfig/20260526-074710-fceratto.json
* 07:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:45 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:43 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1046.eqiad.wmnet
* 07:38 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] (duration: 12m 01s)
* 07:38 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P92928 and previous config saved to /var/cache/conftool/dbconfig/20260526-073702-fceratto.json
* 07:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:35 fabfur: start rebooting magru liberica instances ([[phab:T426563|T426563]])
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92926 and previous config saved to /var/cache/conftool/dbconfig/20260526-073459-fceratto.json
* 07:32 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
* 07:31 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1046.eqiad.wmnet
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff not saved (unable to send diff to phaste); previous config saved to /var/cache/conftool/dbconfig/20260526-072643-fceratto.json
* 07:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
* 07:26 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]]
* 07:25 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92924 and previous config saved to /var/cache/conftool/dbconfig/20260526-072452-fceratto.json
* 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1047.eqiad.wmnet
* 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1047.eqiad.wmnet
* 07:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92923 and previous config saved to /var/cache/conftool/dbconfig/20260526-071635-fceratto.json
* 07:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
* 07:15 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1026.eqiad.wmnet
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92922 and previous config saved to /var/cache/conftool/dbconfig/20260526-071444-fceratto.json
* 07:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 07:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92921 and previous config saved to /var/cache/conftool/dbconfig/20260526-070946-fceratto.json
* 07:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92920 and previous config saved to /var/cache/conftool/dbconfig/20260526-070916-fceratto.json
* 07:09 moritzm: failover Ganeti master in eqiad to ganeti1048
* 07:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1047.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1046.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92919 and previous config saved to /var/cache/conftool/dbconfig/20260526-070436-fceratto.json
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
* 07:04 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1048.eqiad.wmnet
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92918 and previous config saved to /var/cache/conftool/dbconfig/20260526-065909-fceratto.json
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast2003.wikimedia.org
* 06:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 06:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1048.eqiad.wmnet
* 06:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
* 06:53 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1046.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1045.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 06:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92917 and previous config saved to /var/cache/conftool/dbconfig/20260526-064901-fceratto.json
* 06:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92916 and previous config saved to /var/cache/conftool/dbconfig/20260526-064833-fceratto.json
* 06:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
* 06:47 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1222: Switchover
* 06:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6003.wikimedia.org
* 06:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92914 and previous config saved to /var/cache/conftool/dbconfig/20260526-063853-fceratto.json
* 06:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6003.wikimedia.org
* 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92912 and previous config saved to /var/cache/conftool/dbconfig/20260526-063155-fceratto.json
* 06:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:28 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Switchover
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1222 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92910 and previous config saved to /var/cache/conftool/dbconfig/20260526-061656-fceratto.json
* 06:15 fceratto@dns1005: END - running authdns-update
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1162 to s2 primary and set section read-write [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92909 and previous config saved to /var/cache/conftool/dbconfig/20260526-061114-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s2 eqiad as read-only for maintenance - [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92908 and previous config saved to /var/cache/conftool/dbconfig/20260526-061021-fceratto.json
* 06:10 federico3: Starting s2 eqiad failover from db1222 to db1162 - [[phab:T425622|T425622]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1162 with weight 0 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92907 and previous config saved to /var/cache/conftool/dbconfig/20260526-060443-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T425622|T425622]]
* 06:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:02 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:00 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2024.codfw.wmnet,pc[1014,1024].eqiad.wmnet with reason: Maintenance on pc4
* 04:37 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 04:34 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.1 (duration: 02m 32s)
* 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]] (duration: 36m 24s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 20s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-25 ==
* 21:00 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:49 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 20:38 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1045.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1044.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:15 moritzm: truncate krb5kdc.log1 (which made log rotation fail)
* 20:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 19:57 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1044.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1043.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:22 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_eqiad
* 18:49 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1115.eqiad.wmnet
* 18:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5023.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:22 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:22 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5030.eqsin.wmnet
* 18:15 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5023.eqsin.wmnet
* 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1113.eqiad.wmnet
* 18:03 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:02 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqiad
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-upload_eqsin
* 18:01 sukhe: sre.cdn.roll-reboot cookbooks stalled due to icinga reboot
* 18:00 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqsin
* 17:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1043.eqiad.wmnet
* 17:31 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1110.eqiad.wmnet [reason: manually pooling after reboot as icinga was down]
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1042.eqiad.wmnet
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:29 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1111.eqiad.wmnet
* 17:28 sukhe: sukhe@alert1002:~$ sudo systemctl restart icinga.service
* 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92903 and previous config saved to /var/cache/conftool/dbconfig/20260525-171310-fceratto.json
* 17:11 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92902 and previous config saved to /var/cache/conftool/dbconfig/20260525-170302-fceratto.json
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92901 and previous config saved to /var/cache/conftool/dbconfig/20260525-165255-fceratto.json
* 16:51 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1042.eqiad.wmnet
* 16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92900 and previous config saved to /var/cache/conftool/dbconfig/20260525-164247-fceratto.json
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1041.eqiad.wmnet
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:41 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:40 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5021.eqsin.wmnet
* 16:39 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5029.eqsin.wmnet
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92899 and previous config saved to /var/cache/conftool/dbconfig/20260525-163559-fceratto.json
* 16:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92898 and previous config saved to /var/cache/conftool/dbconfig/20260525-163512-fceratto.json
* 16:34 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1108.eqiad.wmnet
* 16:30 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1109.eqiad.wmnet
* 16:26 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92897 and previous config saved to /var/cache/conftool/dbconfig/20260525-162505-fceratto.json
* 16:20 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1041.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1040.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:16 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92896 and previous config saved to /var/cache/conftool/dbconfig/20260525-161457-fceratto.json
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92895 and previous config saved to /var/cache/conftool/dbconfig/20260525-160450-fceratto.json
* 16:02 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92894 and previous config saved to /var/cache/conftool/dbconfig/20260525-155930-fceratto.json
* 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2249.codfw.wmnet with reason: Maintenance
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5020.eqsin.wmnet
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5028.eqsin.wmnet
* 15:52 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1106.eqiad.wmnet
* 15:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1107.eqiad.wmnet
* 15:29 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1040.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1039.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:17 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1013 from dbctl [[phab:T427190|T427190]]', diff saved to https://phabricator.wikimedia.org/P92893 and previous config saved to /var/cache/conftool/dbconfig/20260525-151718-marostegui.json
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5019.eqsin.wmnet
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5027.eqsin.wmnet
* 15:12 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1104.eqiad.wmnet
* 15:11 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1105.eqiad.wmnet
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92892 and previous config saved to /var/cache/conftool/dbconfig/20260525-150309-fceratto.json
* 14:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92891 and previous config saved to /var/cache/conftool/dbconfig/20260525-145301-fceratto.json
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92890 and previous config saved to /var/cache/conftool/dbconfig/20260525-144253-fceratto.json
* 14:33 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1102.eqiad.wmnet
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92889 and previous config saved to /var/cache/conftool/dbconfig/20260525-143246-fceratto.json
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5026.eqsin.wmnet
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5018.eqsin.wmnet
* 14:31 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1103.eqiad.wmnet
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92888 and previous config saved to /var/cache/conftool/dbconfig/20260525-142551-fceratto.json
* 14:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2228.codfw.wmnet with reason: Maintenance
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92887 and previous config saved to /var/cache/conftool/dbconfig/20260525-142520-fceratto.json
* 14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92885 and previous config saved to /var/cache/conftool/dbconfig/20260525-141513-fceratto.json
* 14:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:06 sukhe: curl localhost:9090/pools/inference-staging-grpc_30051 shows ml-staging200[1-3].codfw.wmnet as enabled and pooled: [[phab:T424049|T424049]]
* 14:05 sukhe: sukhe@lvs2013:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92884 and previous config saved to /var/cache/conftool/dbconfig/20260525-140505-fceratto.json
* 14:03 sukhe: sudo cumin 'A:lvs and A:lvs-low-traffic-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service
* 14:00 sukhe: sudo cumin 'A:lvs and A:lvs-secondary-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 13:59 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1039.eqiad.wmnet
* 13:58 sukhe: sudo cumin 'A:lvs and A:eqiad' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"': NOOP change, since service is codfw only
* 13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92882 and previous config saved to /var/cache/conftool/dbconfig/20260525-135458-fceratto.json
* 13:52 Msz2001: Everything deployed, UTC afternoon config+backport window done
* 13:52 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] (duration: 09m 43s)
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1101.eqiad.wmnet
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1100.eqiad.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5025.eqsin.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:49 kart_: Updated Recommendation API to 2026-05-21-044522-production
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92881 and previous config saved to /var/cache/conftool/dbconfig/20260525-134807-fceratto.json
* 13:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2223.codfw.wmnet with reason: Maintenance
* 13:47 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:47 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92880 and previous config saved to /var/cache/conftool/dbconfig/20260525-134737-fceratto.json
* 13:45 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: Reboot
* 13:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]]
* 13:40 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqiad
* 13:39 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqiad
* 13:38 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] (duration: 08m 14s)
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqsin
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqsin
* 13:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92878 and previous config saved to /var/cache/conftool/dbconfig/20260525-133729-fceratto.json
* 13:34 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:33 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1038.eqiad.wmnet
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:31 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]]
* 13:27 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] (duration: 07m 43s)
* 13:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92876 and previous config saved to /var/cache/conftool/dbconfig/20260525-132722-fceratto.json
* 13:23 mszwarc@deploy1003: mszwarc, jhsoby: Continuing with deployment
* 13:21 mszwarc@deploy1003: mszwarc, jhsoby: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:20 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]]
* 13:19 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] (duration: 15m 53s)
* 13:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92875 and previous config saved to /var/cache/conftool/dbconfig/20260525-131714-fceratto.json
* 13:12 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92873 and previous config saved to /var/cache/conftool/dbconfig/20260525-131023-fceratto.json
* 13:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92872 and previous config saved to /var/cache/conftool/dbconfig/20260525-130950-fceratto.json
* 13:07 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]]
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1162: Reboot
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92870 and previous config saved to /var/cache/conftool/dbconfig/20260525-125942-fceratto.json
* 12:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reboot
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:58 kart_: Updated cxserver to 2026-05-24-103047-production ([[phab:T426808|T426808]], [[phab:T373418|T373418]])
* 12:56 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:56 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:54 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db1162: Reboot
* 12:54 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:54 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:53 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1162.eqiad.wmnet with reason: Reboot
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92868 and previous config saved to /var/cache/conftool/dbconfig/20260525-124934-fceratto.json
* 12:40 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:39 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:39 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1038.eqiad.wmnet
* 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92867 and previous config saved to /var/cache/conftool/dbconfig/20260525-123927-fceratto.json
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92866 and previous config saved to /var/cache/conftool/dbconfig/20260525-123239-fceratto.json
* 12:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92865 and previous config saved to /var/cache/conftool/dbconfig/20260525-123208-fceratto.json
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92864 and previous config saved to /var/cache/conftool/dbconfig/20260525-122201-fceratto.json
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1037.eqiad.wmnet
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92863 and previous config saved to /var/cache/conftool/dbconfig/20260525-121153-fceratto.json
* 12:10 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92862 and previous config saved to /var/cache/conftool/dbconfig/20260525-120145-fceratto.json
* 11:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92861 and previous config saved to /var/cache/conftool/dbconfig/20260525-115504-fceratto.json
* 11:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92860 and previous config saved to /var/cache/conftool/dbconfig/20260525-115434-fceratto.json
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92859 and previous config saved to /var/cache/conftool/dbconfig/20260525-114426-fceratto.json
* 11:43 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1037.eqiad.wmnet
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92858 and previous config saved to /var/cache/conftool/dbconfig/20260525-113419-fceratto.json
* 11:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2160.codfw.wmnet with OS trixie
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92857 and previous config saved to /var/cache/conftool/dbconfig/20260525-112411-fceratto.json
* 11:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92856 and previous config saved to /var/cache/conftool/dbconfig/20260525-111717-fceratto.json
* 11:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92855 and previous config saved to /var/cache/conftool/dbconfig/20260525-111648-fceratto.json
* 11:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92854 and previous config saved to /var/cache/conftool/dbconfig/20260525-110640-fceratto.json
* 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 10:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:56 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92853 and previous config saved to /var/cache/conftool/dbconfig/20260525-105633-fceratto.json
* 10:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92852 and previous config saved to /var/cache/conftool/dbconfig/20260525-104625-fceratto.json
* 10:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2160.codfw.wmnet with OS trixie
* 10:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92851 and previous config saved to /var/cache/conftool/dbconfig/20260525-104141-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to pc3 as master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92850 and previous config saved to /var/cache/conftool/dbconfig/20260525-104055-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to dbctl', diff saved to https://phabricator.wikimedia.org/P92849 and previous config saved to /var/cache/conftool/dbconfig/20260525-104027-marostegui.json
* 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92848 and previous config saved to /var/cache/conftool/dbconfig/20260525-103944-fceratto.json
* 10:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 10:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
* 10:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
* 10:27 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:16 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1006.eqiad.wmnet
* 09:57 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.eqiad.wmnet
* 09:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92847 and previous config saved to /var/cache/conftool/dbconfig/20260525-091302-fceratto.json
* 09:12 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92846 and previous config saved to /var/cache/conftool/dbconfig/20260525-090255-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92845 and previous config saved to /var/cache/conftool/dbconfig/20260525-085247-fceratto.json
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92844 and previous config saved to /var/cache/conftool/dbconfig/20260525-084239-fceratto.json
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92843 and previous config saved to /var/cache/conftool/dbconfig/20260525-083540-fceratto.json
* 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2231.codfw.wmnet with reason: Maintenance
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92842 and previous config saved to /var/cache/conftool/dbconfig/20260525-083511-fceratto.json
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92841 and previous config saved to /var/cache/conftool/dbconfig/20260525-082504-fceratto.json
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92840 and previous config saved to /var/cache/conftool/dbconfig/20260525-081456-fceratto.json
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92839 and previous config saved to /var/cache/conftool/dbconfig/20260525-080448-fceratto.json
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92838 and previous config saved to /var/cache/conftool/dbconfig/20260525-075739-fceratto.json
* 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: Maintenance
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92837 and previous config saved to /var/cache/conftool/dbconfig/20260525-075708-fceratto.json
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92836 and previous config saved to /var/cache/conftool/dbconfig/20260525-074700-fceratto.json
* 07:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92835 and previous config saved to /var/cache/conftool/dbconfig/20260525-073653-fceratto.json
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92834 and previous config saved to /var/cache/conftool/dbconfig/20260525-072645-fceratto.json
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92833 and previous config saved to /var/cache/conftool/dbconfig/20260525-071953-fceratto.json
* 07:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2196.codfw.wmnet with reason: Maintenance
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92832 and previous config saved to /var/cache/conftool/dbconfig/20260525-071924-fceratto.json
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92831 and previous config saved to /var/cache/conftool/dbconfig/20260525-070917-fceratto.json
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2233.codfw.wmnet with OS trixie
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92830 and previous config saved to /var/cache/conftool/dbconfig/20260525-065909-fceratto.json
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92829 and previous config saved to /var/cache/conftool/dbconfig/20260525-064902-fceratto.json
* 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92828 and previous config saved to /var/cache/conftool/dbconfig/20260525-064305-fceratto.json
* 06:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2233.codfw.wmnet with OS trixie
* 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2233.codfw.wmnet with reason: Reimage to Trixie
* 06:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:17 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboot upgrade m2
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2233.codfw.wmnet with reason: Reboot upgrade m2
* 06:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1027.eqiad.wmnet with reason: Reboot
* 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2023.codfw.wmnet,pc[1013,1023].eqiad.wmnet with reason: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1013.eqiad.wmnet: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1013.eqiad.wmnet: Maintenance on pc3
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 43s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-24 ==
* 19:08 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-23 ==
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-22 ==
* 23:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 23:38 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 22:20 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:29 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:28 inflatador: bking@deploy1003 set eqiad prod cirrus `node_concurrent_recoveries` up to 7 from 4 [[phab:T426585|T426585]]
* 20:27 inflatador: bking@deploy1003 set codfw prod cirrus `node_concurrent_recoveries` back down to 4 from 7 [[phab:T426585|T426585]]
* 18:39 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 17:34 topranks: enable ttl protection on esams CRs IBGP session
* 17:28 topranks: enable ttl protection on ulsfo CRs IBGP session
* 16:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:49 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:16 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:58 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:15 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:14 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2008-dev.codfw.wmnet
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb[1020,1022-1025].eqiad.wmnet
* 14:29 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:23 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2008-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2007-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:03 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 13:59 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb[1020,1022-1025].eqiad.wmnet
* 13:58 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 13:53 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2007-dev.codfw.wmnet
* 13:52 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1018.eqiad.wmnet
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:46 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for 6 hosts
* 13:16 inflatador: bking@deploy1002 set search_codfw cluster recovery settings from 4 to 7 [[phab:T426560|T426560]]
* 13:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for 6 hosts
* 13:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 13:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 13:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:10 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet
* 13:09 elukey: uploaded spicerack_12.6.0 to apt.wikimedia.org bookworm-wikimedia
* 13:08 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1017.eqiad.wmnet
* 12:59 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3081.esams.wmnet
* 12:54 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:41 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:15 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3080.esams.wmnet
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 12:03 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3073.esams.wmnet
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: Migration of db2154.codfw.wmnet completed
* 11:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3072.esams.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 11:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1172: Migration of db1172.eqiad.wmnet completed
* 11:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
* 11:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3079.esams.wmnet
* 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
* 10:55 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:48 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet
* 10:43 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2154: Migration of db2154.codfw.wmnet completed
* 10:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
* 10:37 moritzm: remove ganeti1024 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2154.codfw.wmnet with OS trixie
* 10:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2010.codfw.wmnet with OS trixie
* 10:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1172: Migration of db1172.eqiad.wmnet completed
* 10:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3078.esams.wmnet
* 10:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS trixie
* 10:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1017.eqiad.wmnet
* 10:13 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3071.esams.wmnet
* 09:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:56 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS trixie
* 09:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:51 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: Upgrading db2154.codfw.wmnet
* 09:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2154: Upgrading db2154.codfw.wmnet
* 09:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS trixie
* 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS trixie
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3070.esams.wmnet
* 09:21 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:16 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 09:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3077.esams.wmnet
* 09:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:03 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 08:47 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 08:40 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:30 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3076.esams.wmnet
* 08:18 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:15 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:07 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:07 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3069.esams.wmnet
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 07:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 07:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3068.esams.wmnet
* 07:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:10 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3075.esams.wmnet
* 07:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1057
* 07:01 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1057
* 06:58 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3067.esams.wmnet
* 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
* 06:46 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 06:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3007.wikimedia.org
* 06:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3007.wikimedia.org
* 05:25 marostegui@dns1004: END - running authdns-update
* 05:24 marostegui@dns1004: START - running authdns-update
* 05:23 marostegui: Failover m5-master [[phab:T426633|T426633]]
* 05:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1028.eqiad.wmnet with reason: Reboot
* 05:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy2005.codfw.wmnet with reason: Reboot
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1012.eqiad.wmnet
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:06 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:03 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 04:56 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1012.eqiad.wmnet
== 2026-05-21 ==
* 23:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] (duration: 06m 42s)
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified
* 23:36 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]]
* 22:26 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2002.codfw.wmnet with OS trixie
* 22:08 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:03 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:02 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:44 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2002.codfw.wmnet with OS trixie
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:20 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:19 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:26 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:16 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 19:22 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase
* 19:10 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:59 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:53 papaul: rebooting msw1-codfw
* 18:50 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:39 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:50 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:48 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:43 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:39 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 17:39 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:38 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 17:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 17:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:36 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:25 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:25 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:23 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:22 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1016.eqiad.wmnet
* 17:22 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:13 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1016.eqiad.wmnet
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92810 and previous config saved to /var/cache/conftool/dbconfig/20260521-170823-ladsgroup.json
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 16:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 16:55 papaul: rebooting msw-d3-codfw
* 16:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:52 papaul: rebooting msw-c7-codfw
* 16:51 papaul: rebooting msw-c6-codfw
* 16:48 papaul: rebooting msw-b7-codfw
* 16:48 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet
* 16:45 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1014.eqiad.wmnet
* 16:43 papaul: rebooting msw-b6-codfw
* 16:40 papaul: rebooting msw-a1-codfw
* 16:37 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:37 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1014.eqiad.wmnet
* 16:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:33 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:26 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc1022.eqiad.wmnet with reason: Move to nftables
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc2022.codfw.wmnet with reason: Move to nftables
* 16:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Repooling
* 16:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92807 and previous config saved to /var/cache/conftool/dbconfig/20260521-161808-ladsgroup.json
* 16:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:52 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Repooling
* 15:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92804 and previous config saved to /var/cache/conftool/dbconfig/20260521-154108-fceratto.json
* 15:39 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92803 and previous config saved to /var/cache/conftool/dbconfig/20260521-153400-fceratto.json
* 15:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2048.codfw.wmnet with reason: Maintenance
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92802 and previous config saved to /var/cache/conftool/dbconfig/20260521-153331-fceratto.json
* 15:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:25 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92801 and previous config saved to /var/cache/conftool/dbconfig/20260521-152323-fceratto.json
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet
* 15:19 claime: Enabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 15:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet
* 15:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92800 and previous config saved to /var/cache/conftool/dbconfig/20260521-151316-fceratto.json
* 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 15:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet
* 15:07 elukey@cumin1003: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:05 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:04 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] (duration: 10m 11s)
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92799 and previous config saved to /var/cache/conftool/dbconfig/20260521-150308-fceratto.json
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet
* 15:00 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master
* 15:00 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:00 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.pki.restart-reboot (exit_code=0) rolling reboot on A:pki
* 14:57 claime: Disabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 14:56 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:55 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 14:54 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-build1001.eqiad.wmnet
* 14:54 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]]
* 14:54 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 14:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1001.eqiad.wmnet
* 14:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1001.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92798 and previous config saved to /var/cache/conftool/dbconfig/20260521-145132-fceratto.json
* 14:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2040.codfw.wmnet with reason: Maintenance
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92797 and previous config saved to /var/cache/conftool/dbconfig/20260521-145103-fceratto.json
* 14:50 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-build1001.eqiad.wmnet
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Migration of db2241.codfw.wmnet completed
* 14:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1001.eqiad.wmnet
* 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
* 14:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1001.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-eqiad
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1011.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1011.eqiad.wmnet
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92795 and previous config saved to /var/cache/conftool/dbconfig/20260521-144055-fceratto.json
* 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:37 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1011.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1011.eqiad.wmnet
* 14:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 14:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1010.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1010.eqiad.wmnet
* 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92793 and previous config saved to /var/cache/conftool/dbconfig/20260521-143045-fceratto.json
* 14:30 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:30 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:29 elukey@cumin1003: START - Cookbook sre.pki.restart-reboot rolling reboot on A:pki
* 14:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
* 14:27 slyngshede@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-reboot (exit_code=1) rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 14:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet
* 14:24 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1010.eqiad.wmnet
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92792 and previous config saved to /var/cache/conftool/dbconfig/20260521-142037-fceratto.json
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:17 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1010.eqiad.wmnet
* 14:14 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1009.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1009.eqiad.wmnet
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: repool after maintenance
* 14:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92789 and previous config saved to /var/cache/conftool/dbconfig/20260521-140906-fceratto.json
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
* 14:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92788 and previous config saved to /var/cache/conftool/dbconfig/20260521-140837-fceratto.json
* 14:08 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1009.eqiad.wmnet
* 14:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet
* 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
* 14:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Migration of db2241.codfw.wmnet completed
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1009.eqiad.wmnet
* 14:03 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1008.eqiad.wmnet
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1008.eqiad.wmnet
* 14:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2241.codfw.wmnet with OS trixie
* 13:59 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
* 13:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92786 and previous config saved to /var/cache/conftool/dbconfig/20260521-135830-fceratto.json
* 13:58 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1007.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1007.eqiad.wmnet
* 13:51 Lucas_WMDE: UTC afternoon backport+config window done
* 13:51 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] (duration: 07m 20s)
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92784 and previous config saved to /var/cache/conftool/dbconfig/20260521-134822-fceratto.json
* 13:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1007.eqiad.wmnet
* 13:47 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 13:46 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
* 13:45 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes
* 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:44 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 13:43 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]]
* 13:43 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 13:43 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1007.eqiad.wmnet
* 13:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1006.eqiad.wmnet
* 13:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1006.eqiad.wmnet
* 13:41 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] (duration: 06m 52s)
* 13:41 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 13:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet
* 13:38 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
* 13:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92782 and previous config saved to /var/cache/conftool/dbconfig/20260521-133815-fceratto.json
* 13:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1006.eqiad.wmnet
* 13:37 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
* 13:37 dbrant@deploy1003: dbrant: Continuing with deployment
* 13:36 dbrant@deploy1003: dbrant: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
* 13:35 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]]
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1006.eqiad.wmnet
* 13:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1005.eqiad.wmnet
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1005.eqiad.wmnet
* 13:31 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] (duration: 09m 11s)
* 13:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92781 and previous config saved to /var/cache/conftool/dbconfig/20260521-133116-fceratto.json
* 13:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92780 and previous config saved to /var/cache/conftool/dbconfig/20260521-133048-fceratto.json
* 13:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
* 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet
* 13:27 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1005.eqiad.wmnet
* 13:27 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: repool after maintenance
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2241.codfw.wmnet with OS trixie
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:24 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:22 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]]
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1005.eqiad.wmnet
* 13:22 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1004.eqiad.wmnet
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1004.eqiad.wmnet
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92778 and previous config saved to /var/cache/conftool/dbconfig/20260521-132041-fceratto.json
* 13:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
* 13:20 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] (duration: 11m 55s)
* 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 13:17 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet
* 13:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1039: Repooling
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
* 13:15 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1004.eqiad.wmnet
* 13:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 13:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1004.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92776 and previous config saved to /var/cache/conftool/dbconfig/20260521-131033-fceratto.json
* 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1003.eqiad.wmnet
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1003.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 13:10 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2241 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92775 and previous config saved to /var/cache/conftool/dbconfig/20260521-131025-cwilliams.json
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:10 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 13:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3074.esams.wmnet
* 13:06 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2162 to x3 primary [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92774 and previous config saved to /var/cache/conftool/dbconfig/20260521-130609-cwilliams.json
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:04 cezmunsta: Starting x3 codfw failover from db2241 to db2162 - [[phab:T426936|T426936]]
* 13:04 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1003.eqiad.wmnet
* 13:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 13:00 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92772 and previous config saved to /var/cache/conftool/dbconfig/20260521-130018-fceratto.json
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1003.eqiad.wmnet
* 12:59 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:59 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1002.eqiad.wmnet
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1002.eqiad.wmnet
* 12:58 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:57 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2162 with weight 0 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92771 and previous config saved to /var/cache/conftool/dbconfig/20260521-125645-cwilliams.json
* 12:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T426936|T426936]]
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet
* 12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
* 12:54 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1002.eqiad.wmnet
* 12:54 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6008.drmrs.wmnet
* 12:53 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:52 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1002.eqiad.wmnet
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
* 12:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:48 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3066.esams.wmnet
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92770 and previous config saved to /var/cache/conftool/dbconfig/20260521-124707-fceratto.json
* 12:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
* 12:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1039: Repooling
* 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet
* 12:45 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:43 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] (duration: 07m 54s)
* 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92768 and previous config saved to /var/cache/conftool/dbconfig/20260521-124014-fceratto.json
* 12:39 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
* 12:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 12:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:36 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:36 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]]
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:34 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:34 kart_: Updated cxserver to 2026-05-20-034002-production ([[phab:T388690|T388690]], [[phab:T404295|T404295]], [[phab:T391703|T391703]], [[phab:T426605|T426605]])
* 12:34 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
* 12:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
* 12:30 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:30 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
* 12:29 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92767 and previous config saved to /var/cache/conftool/dbconfig/20260521-122905-fceratto.json
* 12:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
* 12:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92766 and previous config saved to /var/cache/conftool/dbconfig/20260521-122839-fceratto.json
* 12:27 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:27 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:26 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-staging-worker
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2003.codfw.wmnet
* 12:23 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2003.codfw.wmnet
* 12:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
* 12:21 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:21 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:21 moritzm: installing nginx security updates
* 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 12:20 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92765 and previous config saved to /var/cache/conftool/dbconfig/20260521-121832-fceratto.json
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2003.codfw.wmnet
* 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
* 12:15 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
* 12:13 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6007.drmrs.wmnet
* 12:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92764 and previous config saved to /var/cache/conftool/dbconfig/20260521-120824-fceratto.json
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2003.codfw.wmnet
* 12:07 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2002.codfw.wmnet
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2002.codfw.wmnet
* 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
* 12:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6014.drmrs.wmnet
* 12:00 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:00 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2002.codfw.wmnet
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
* 11:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92763 and previous config saved to /var/cache/conftool/dbconfig/20260521-115817-fceratto.json
* 11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
* 11:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
* 11:51 taavi: disabling puppet on C:bird to roll out {{Gerrit|1289919}}
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92762 and previous config saved to /var/cache/conftool/dbconfig/20260521-115112-fceratto.json
* 11:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2047.codfw.wmnet with reason: Maintenance
* 11:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2002.codfw.wmnet
* 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92761 and previous config saved to /var/cache/conftool/dbconfig/20260521-115043-fceratto.json
* 11:50 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2001.codfw.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2001.codfw.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt2002.wikimedia.org
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet
* 11:45 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2001.codfw.wmnet
* 11:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp1001.eqiad.wmnet
* 11:44 kartik@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet
* 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt2002.wikimedia.org
* 11:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet
* 11:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92760 and previous config saved to /var/cache/conftool/dbconfig/20260521-114036-fceratto.json
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp1001.eqiad.wmnet
* 11:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
* 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 11:36 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet
* 11:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2001.codfw.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker
* 11:35 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
* 11:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
* 11:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
* 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92759 and previous config saved to /var/cache/conftool/dbconfig/20260521-113028-fceratto.json
* 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2014.codfw.wmnet
* 11:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
* 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
* 11:26 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet
* 11:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2014.codfw.wmnet
* 11:20 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6013.drmrs.wmnet
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92758 and previous config saved to /var/cache/conftool/dbconfig/20260521-112021-fceratto.json
* 11:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
* 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2013.codfw.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
* 11:09 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92757 and previous config saved to /var/cache/conftool/dbconfig/20260521-110851-fceratto.json
* 11:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2037.codfw.wmnet with reason: Maintenance
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92756 and previous config saved to /var/cache/conftool/dbconfig/20260521-110822-fceratto.json
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
* 11:05 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2013.codfw.wmnet
* 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:04 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6006.drmrs.wmnet
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
* 11:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
* 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92753 and previous config saved to /var/cache/conftool/dbconfig/20260521-105815-fceratto.json
* 10:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet
* 10:55 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:54 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2012.codfw.wmnet
* 10:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92752 and previous config saved to /var/cache/conftool/dbconfig/20260521-104807-fceratto.json
* 10:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2012.codfw.wmnet
* 10:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet
* 10:44 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] (duration: 08m 02s)
* 10:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:40 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet
* 10:40 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:39 jiji@deploy1003: jiji: Continuing with deployment
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92751 and previous config saved to /var/cache/conftool/dbconfig/20260521-103759-fceratto.json
* 10:37 jiji@deploy1003: jiji: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:36 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]]
* 10:35 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet
* 10:34 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:29 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet
* 10:27 dcausse: [[phab:T423993|T423993]]: reindexing all archive indices
* 10:27 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92749 and previous config saved to /var/cache/conftool/dbconfig/20260521-102630-fceratto.json
* 10:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2036.codfw.wmnet with reason: Maintenance
* 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92748 and previous config saved to /var/cache/conftool/dbconfig/20260521-102601-fceratto.json
* 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2011.codfw.wmnet
* 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6005.drmrs.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet
* 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2011.codfw.wmnet
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet
* 10:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92747 and previous config saved to /var/cache/conftool/dbconfig/20260521-101552-fceratto.json
* 10:15 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:14 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet
* 10:12 moritzm: installing postgresql security updates
* 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet
* 10:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet
* 10:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon1003.wikimedia.org
* 10:09 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:08 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet
* 10:08 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet
* 10:07 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet
* 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet
* 10:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92746 and previous config saved to /var/cache/conftool/dbconfig/20260521-100545-fceratto.json
* 10:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet
* 10:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet
* 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet
* 10:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet
* 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet
* 10:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org
* 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org
* 09:59 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1005.eqiad.wmnet
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-codfw
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2005.codfw.wmnet
* 09:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2005.codfw.wmnet
* 09:56 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
* 09:56 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:56 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92745 and previous config saved to /var/cache/conftool/dbconfig/20260521-095536-fceratto.json
* 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1384.eqiad.wmnet
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2004.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2004.codfw.wmnet
* 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
* 09:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
* 09:49 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1384.eqiad.wmnet
* 09:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1383.eqiad.wmnet
* 09:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92744 and previous config saved to /var/cache/conftool/dbconfig/20260521-094829-fceratto.json
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92743 and previous config saved to /var/cache/conftool/dbconfig/20260521-094801-fceratto.json
* 09:47 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet
* 09:47 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 [[phab:T426563|T426563]]
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-eqiad
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:44 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1383.eqiad.wmnet
* 09:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1382.eqiad.wmnet
* 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
* 09:39 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet
* 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1382.eqiad.wmnet
* 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1381.eqiad.wmnet
* 09:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2002.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2002.codfw.wmnet
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92742 and previous config saved to /var/cache/conftool/dbconfig/20260521-093754-fceratto.json
* 09:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet
* 09:36 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 09:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6012.drmrs.wmnet
* 09:34 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet
* 09:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1381.eqiad.wmnet
* 09:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1380.eqiad.wmnet
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2001.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2001.codfw.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet
* 09:29 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=eqiad
* 09:29 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts.*,name=codfw
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet
* 09:28 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92741 and previous config saved to /var/cache/conftool/dbconfig/20260521-092746-fceratto.json
* 09:27 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1380.eqiad.wmnet
* 09:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1379.eqiad.wmnet
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet
* 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
* 09:25 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet
* 09:24 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=codfw
* 09:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:23 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-eqiad
* 09:22 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1379.eqiad.wmnet
* 09:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1378.eqiad.wmnet
* 09:21 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-codfw
* 09:21 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:18 moritzm: remove ganeti1023 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92740 and previous config saved to /var/cache/conftool/dbconfig/20260521-091738-fceratto.json
* 09:16 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1378.eqiad.wmnet
* 09:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1376.eqiad.wmnet
* 09:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Repooling
* 09:07 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1376.eqiad.wmnet
* 09:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1375.eqiad.wmnet
* 09:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92738 and previous config saved to /var/cache/conftool/dbconfig/20260521-090609-fceratto.json
* 09:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1037.eqiad.wmnet with reason: Maintenance
* 09:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1375.eqiad.wmnet
* 09:01 btullis@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:55 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6011.drmrs.wmnet
* 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:44 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6004.drmrs.wmnet
* 08:37 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Repooling
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92733 and previous config saved to /var/cache/conftool/dbconfig/20260521-082951-fceratto.json
* 08:29 hashar@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92731 and previous config saved to /var/cache/conftool/dbconfig/20260521-081642-fceratto.json
* 08:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
* 08:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6003.drmrs.wmnet
* 08:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1256.eqiad.wmnet with OS trixie
* 07:52 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:51 marostegui@dns1004: END - running authdns-update
* 07:50 marostegui@dns1004: START - running authdns-update
* 07:48 marostegui: Failover m3-master [[phab:T426633|T426633]]
* 07:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet
* 07:46 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6010.drmrs.wmnet
* 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:38 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6002.drmrs.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:24 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:24 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1256.eqiad.wmnet with OS trixie
* 07:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy1025.eqiad.wmnet with reason: Rebooting
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:52 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
* 06:40 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 06:39 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
* 06:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
* 06:24 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:23 arnaudb@cumin1003: END (FAIL) - Cookbook sre.gerrit.reboot-gerrit (exit_code=99) Rebooting Gerrit on gerrit2003
* 06:22 arnaudb@cumin1003: START - Cookbook sre.gerrit.reboot-gerrit Rebooting Gerrit on gerrit2003
* 06:15 marostegui@dns1004: END - running authdns-update
* 06:14 marostegui: Failover m2-master [[phab:T426633|T426633]]
* 06:13 marostegui@dns1004: START - running authdns-update
* 05:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1012 from dbctl [[phab:T426930|T426930]]', diff saved to https://phabricator.wikimedia.org/P92728 and previous config saved to /var/cache/conftool/dbconfig/20260521-053858-marostegui.json
* 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92727 and previous config saved to /var/cache/conftool/dbconfig/20260521-053000-marostegui.json
* 05:29 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1022 to pc2 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92726 and previous config saved to /var/cache/conftool/dbconfig/20260521-052905-marostegui.json
* 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc1012.eqiad.wmnet with reason: Cloning
* 02:41 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on planet1003.eqiad.wmnet with reason: debug wip
* 02:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1027.eqiad.wmnet
* 01:22 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1027.eqiad.wmnet
* 00:55 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
ris28oiosfp6c1colncczadr83rikri
2428860
2428859
2026-06-20T13:32:03Z
Stashbot
7414
arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
2428860
wikitext
text/x-wiki
== 2026-06-20 ==
* 13:32 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:31 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:31 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-19 ==
* 19:21 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303006{{!}}Disable ShortUrl on remaining wikis (T107188)]] (duration: 80m 14s)
* 19:17 krinkle@deploy1003: krinkle: Continuing with deployment
* 18:03 krinkle@deploy1003: krinkle: Backport for [[gerrit:1303006{{!}}Disable ShortUrl on remaining wikis (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:01 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1303006{{!}}Disable ShortUrl on remaining wikis (T107188)]]
* 16:22 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 16:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1023.eqiad.wmnet
* 16:08 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 16:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1023.eqiad.wmnet
* 16:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1022.eqiad.wmnet
* 15:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1022.eqiad.wmnet
* 15:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1021.eqiad.wmnet
* 15:45 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1021.eqiad.wmnet
* 15:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet
* 15:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1020.eqiad.wmnet
* 15:34 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 15:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2004.codfw.wmnet
* 15:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2004.codfw.wmnet
* 15:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2003.codfw.wmnet
* 15:17 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2003.codfw.wmnet
* 15:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2002.codfw.wmnet
* 15:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2002.codfw.wmnet
* 15:11 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet
* 14:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2009.codfw.wmnet with OS trixie
* 13:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
* 13:41 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
* 13:28 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
* 13:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2001.codfw.wmnet
* 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs1003.eqiad.wmnet
* 12:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs1003.eqiad.wmnet
* 12:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs1002.eqiad.wmnet
* 12:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs1002.eqiad.wmnet
* 12:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs1001.eqiad.wmnet
* 12:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs1001.eqiad.wmnet
* 12:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1022.eqiad.wmnet
* 12:32 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1022.eqiad.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2234.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2234.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2232.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2232.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 12:10 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 12:08 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on phab2002.codfw.wmnet with reason: Host Replacement
* 12:05 urbanecm@deploy1003: mwscript-k8s job started: GrowthExperiments:migrateMentorStatusAway.php --wiki=viwiki # [[phab:T409170|T409170]]
* 12:04 urbanecm@deploy1003: mwscript-k8s job started: GrowthExperiments:MigrateMentorStatusAway --wiki=viwiki # [[phab:T409170|T409170]]
* 11:33 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:38 moritzm: imported nodejs 24.17.0-1nodesource1 to thirdparty/node24 for trixie-wikimedia
* 10:37 moritzm: imported nodejs 22.23.0-1nodesource1 to thirdparty/node22 for trixie-wikimedia
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 10:29 sergi0: Run `MigrateMentorStatusAway` script for all wikis in growthexperiments dblist - [[phab:T409170|T409170]]
* 10:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet
* 10:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1020.eqiad.wmnet
* 10:04 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1024
* 10:03 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1024
* 10:03 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1023
* 10:03 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1023
* 10:03 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1021
* 10:03 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1021
* 10:00 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1024
* 09:59 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1024
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1022
* 09:57 btullis@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1022
* 09:56 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1020
* 09:54 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1020
* 09:43 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet
* 09:36 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1020.eqiad.wmnet
* 07:32 slyngs: Update IDP/SSO to CAS v7.3.7.3
* 07:31 slyngshede@dns1004: END - running authdns-update
* 07:30 slyngshede@dns1004: START - running authdns-update
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 49s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:19 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: sync
* 01:18 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: sync
* 01:18 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: sync
* 01:17 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics: sync
* 01:17 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: sync
* 01:17 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics: sync
* 01:06 ottomata: roll restart eventgate-analytics to pick up stream config change - [[phab:T427787|T427787]]
== 2026-06-18 ==
* 23:46 Amir1: ALTER TABLE reading_list_project AUTO_INCREMENT = 882; on wikishared on x1 master ([[phab:T428002|T428002]])
* 23:34 rzl@deploy1003: Finished deploy [docker-pkg/deploy@f030aed]: (no justification provided) (duration: 00m 45s)
* 23:33 rzl@deploy1003: Started deploy [docker-pkg/deploy@f030aed]: (no justification provided)
* 23:28 rzl@deploy1003: Finished deploy [docker-pkg/deploy@f030aed]: (no justification provided) (duration: 00m 26s)
* 23:27 rzl@deploy1003: Started deploy [docker-pkg/deploy@f030aed]: (no justification provided)
* 23:03 rzl: rzl@apt1002:~$ sudo -i reprepro -C main include trixie-wikimedia /home/rzl/httpbb/trixie/httpbb_0.0.5-1+deb13u1_amd64.changes # [[phab:T427899|T427899]]
* 22:52 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304195{{!}}hCaptcha: Re-enable for mcrundo (T427612)]] (duration: 07m 25s)
* 22:47 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 22:46 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1304195{{!}}hCaptcha: Re-enable for mcrundo (T427612)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1304195{{!}}hCaptcha: Re-enable for mcrundo (T427612)]]
* 21:29 maryum: Deployed security fix for [[phab:T428833|T428833]]
* 21:14 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303493{{!}}Prevent surveys being automatically added to non-Wikipedias (T393436)]] (duration: 07m 54s)
* 21:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 21:10 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 21:09 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 21:08 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1303493{{!}}Prevent surveys being automatically added to non-Wikipedias (T393436)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:06 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1303493{{!}}Prevent surveys being automatically added to non-Wikipedias (T393436)]]
* 20:12 dani@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303895{{!}}Deploy English Wikipedia Mobile App Survey (T428876)]] (duration: 08m 20s)
* 20:08 dani@deploy1003: dani: Continuing with deployment
* 20:06 dani@deploy1003: dani: Backport for [[gerrit:1303895{{!}}Deploy English Wikipedia Mobile App Survey (T428876)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 dani@deploy1003: Started scap sync-world: Backport for [[gerrit:1303895{{!}}Deploy English Wikipedia Mobile App Survey (T428876)]]
* 19:11 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=dns7002.*
* 19:09 cdobbins@dns1004: END - running authdns-update
* 19:08 cdobbins@dns1004: START - running authdns-update
* 19:07 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=dns7002.*,service=authdns-update
* 19:05 cdobbins@dns1004: END - running authdns-update
* 19:04 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2002.codfw.wmnet with reason: Host Replacement
* 19:03 cdobbins@dns1004: START - running authdns-update
* 19:01 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dns7002.wikimedia.org
* 19:01 cdobbins@cumin2002: START - Cookbook sre.hosts.remove-downtime for dns7002.wikimedia.org
* 18:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns7002.wikimedia.org with OS bookworm
* 18:39 jhuneidi@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.7 refs [[phab:T423916|T423916]]
* 18:37 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 18:34 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 18:33 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 18:31 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 18:29 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:28 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:27 swfrench-wmf: (eqiad) kubectl delete pod coredns-54cdd9bdf-6hwb5 -n kube-system - [[phab:T429156|T429156]]
* 18:27 swfrench-wmf: (eqiad) kubectl delete pod coredns-54cdd9bdf-6n4ps -n kube-system - [[phab:T429156|T429156]]
* 18:26 jhuneidi@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304067{{!}}SpecialSpecialPages: Guard against special pages with no content-language alias (T429584)]] (duration: 08m 46s)
* 18:21 jhuneidi@deploy1003: jhuneidi, jforrester: Continuing with deployment
* 18:19 jhuneidi@deploy1003: jhuneidi, jforrester: Backport for [[gerrit:1304067{{!}}SpecialSpecialPages: Guard against special pages with no content-language alias (T429584)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:17 jhuneidi@deploy1003: Started scap sync-world: Backport for [[gerrit:1304067{{!}}SpecialSpecialPages: Guard against special pages with no content-language alias (T429584)]]
* 18:09 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 18:04 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 17:37 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host dns7002.wikimedia.org with OS bookworm
* 16:28 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304112{{!}}Add script to fix fr_archive_name drifts (T428406)]] (duration: 06m 46s)
* 16:24 zabe@deploy1003: zabe: Continuing with deployment
* 16:24 zabe@deploy1003: zabe: Backport for [[gerrit:1304112{{!}}Add script to fix fr_archive_name drifts (T428406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:22 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1304112{{!}}Add script to fix fr_archive_name drifts (T428406)]]
* 15:55 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303981{{!}}LocalFileMoveBatch: Also update fr_archive_name when moving file (T428406)]] (duration: 06m 49s)
* 15:51 zabe@deploy1003: zabe: Continuing with deployment
* 15:51 zabe@deploy1003: zabe: Backport for [[gerrit:1303981{{!}}LocalFileMoveBatch: Also update fr_archive_name when moving file (T428406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1303981{{!}}LocalFileMoveBatch: Also update fr_archive_name when moving file (T428406)]]
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:08 elukey@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 15:08 elukey@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 15:04 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304082{{!}}Check that data-parsoid is an array before accessing it as such (T429582)]] (duration: 11m 17s)
* 15:00 cscott@deploy1003: ihurbain, cscott: Continuing with deployment
* 14:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:57 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:55 cscott@deploy1003: ihurbain, cscott: Backport for [[gerrit:1304082{{!}}Check that data-parsoid is an array before accessing it as such (T429582)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:53 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1304082{{!}}Check that data-parsoid is an array before accessing it as such (T429582)]]
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:51 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:51 ayounsi@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:46 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:42 moritzm: installing zsh updates from Bookworm point release
* 14:37 brouberol@dns1004: END - running authdns-update
* 14:35 brouberol@dns1004: START - running authdns-update
* 14:27 jgreen@dns1004: END - running authdns-update
* 14:25 jgreen@dns1004: START - running authdns-update
* 14:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy2007.codfw.wmnet
* 14:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy2007.codfw.wmnet
* 14:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy2008.codfw.wmnet
* 14:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy2008.codfw.wmnet
* 14:20 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 14:20 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 14:19 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2235.codfw.wmnet
* 14:19 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2235.codfw.wmnet
* 14:14 Msz2001: Finished deploying private code change
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2235.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2008.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 14:08 moritzm: installing unbound security updates
* 14:07 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2234.codfw.wmnet
* 14:07 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2234.codfw.wmnet
* 14:00 tgr_: UTC afternoon deploys done
* 14:00 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304038{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]], [[gerrit:1304039{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]] (duration: 11m 51s)
* 13:56 tgr@deploy1003: tgr: Continuing with deployment
* 13:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2234.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2007.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:52 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy2005.codfw.wmnet
* 13:52 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy2005.codfw.wmnet
* 13:51 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2232.codfw.wmnet
* 13:51 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2232.codfw.wmnet
* 13:51 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 13:51 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 13:50 tgr@deploy1003: tgr: Backport for [[gerrit:1304038{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]], [[gerrit:1304039{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:48 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1304038{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]], [[gerrit:1304039{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]]
* 13:46 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303613{{!}}magwiki: add wordmark, metanamespace, sitename and timezone (T428279)]], [[gerrit:1304004{{!}}stream: webrequest.page_trending.dev0 (T429588)]] (duration: 08m 15s)
* 13:42 tgr@deploy1003: javiermonton, tgr, anzx: Continuing with deployment
* 13:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 13:40 tgr@deploy1003: javiermonton, tgr, anzx: Backport for [[gerrit:1303613{{!}}magwiki: add wordmark, metanamespace, sitename and timezone (T428279)]], [[gerrit:1304004{{!}}stream: webrequest.page_trending.dev0 (T429588)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:38 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1303613{{!}}magwiki: add wordmark, metanamespace, sitename and timezone (T428279)]], [[gerrit:1304004{{!}}stream: webrequest.page_trending.dev0 (T429588)]]
* 13:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2232.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2005.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:33 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 13:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303004{{!}}REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771)]] (duration: 06m 56s)
* 13:26 ladsgroup@deploy1003: ladsgroup, bpirkle: Continuing with deployment
* 13:25 ladsgroup@deploy1003: ladsgroup, bpirkle: Backport for [[gerrit:1303004{{!}}REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303004{{!}}REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771)]]
* 13:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:21 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302923{{!}}EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380)]] (duration: 10m 55s)
* 13:14 ladsgroup@deploy1003: ladsgroup, lerickson: Continuing with deployment
* 13:10 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:10 ladsgroup@deploy1003: ladsgroup, lerickson: Backport for [[gerrit:1302923{{!}}EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:08 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:08 fabfur: deploying new haproxykafka on A:cp to parse for x_provenance ([[phab:T427068|T427068]])
* 13:08 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1302923{{!}}EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of testvm2005.codfw.wmnet to plain
* 13:05 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to plain
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
* 13:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis magwiki in section s5
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 12:56 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 12:39 fabfur: upgrade haproxykafka on cp1111 to test for new x-provenance field ([[phab:T427068|T427068]])
* 12:36 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 12:35 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis magwiki in section s5
* 12:34 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 12:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis magwiki in section s5
* 12:31 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304017{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]], [[gerrit:1304016{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]] (duration: 17m 49s)
* 12:29 cwilliams@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis magwiki in section s5
* 12:27 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:16 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1304017{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]], [[gerrit:1304016{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:14 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1304017{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]], [[gerrit:1304016{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]]
* 11:10 atsukoito: atsuko updated charlie to 0.0.19 https://w.wiki/RPKN
* 10:37 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.disable-merges (exit_code=99)
* 10:37 jmm@cumin2002: START - Cookbook sre.puppet.disable-merges
* 10:24 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303986{{!}}hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394)]] (duration: 12m 13s)
* 10:19 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 10:14 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1303986{{!}}hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1303986{{!}}hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394)]]
* 10:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 10:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 10:01 fabfur@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]"
* 10:01 fabfur@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]
* 10:00 fabfur@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]
* 10:00 fabfur@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]"
* 09:59 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303983{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]], [[gerrit:1303982{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]] (duration: 08m 10s)
* 09:55 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:54 kharlan@deploy1003: kharlan: Backport for [[gerrit:1303983{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]], [[gerrit:1303982{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:51 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1303983{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]], [[gerrit:1303982{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]]
* 09:33 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 09:33 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 09:33 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 09:32 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 09:11 moritzm: installing apache2 security updates
* 08:55 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 08:53 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 08:53 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 08:51 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 08:51 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 08:51 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 08:35 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:34 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:22 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 08:21 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 08:20 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 08:19 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 08:05 moritzm: regenerate pbuilder environments on build2001 to use deb.debian.org [[phab:T416707|T416707]]
* 08:02 moritzm: uploaded wmf-laptop 1.0.6 to component/wmf-laptop on apt.wikimedia.org
* 08:01 moritzm: regenerate pbuilder environments on build2002 to use deb.debian.org [[phab:T416707|T416707]]
* 06:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 06:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2040: Migration of es2040.codfw.wmnet completed
* 06:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2040: Migration of es2040.codfw.wmnet completed
* 05:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2040.codfw.wmnet with OS trixie
* 05:41 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
* 05:41 marostegui@cumin1003: Removing db1224 from zarcillo [[phab:T429561|T429561]]
* 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1224.eqiad.wmnet
* 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1224.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:40 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1224.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:36 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 05:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2040.codfw.wmnet with reason: host reimage
* 05:31 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2040.codfw.wmnet with reason: host reimage
* 05:31 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db1224.eqiad.wmnet
* 05:30 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 05:27 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db1224 from dbctl [[phab:T429561|T429561]]', diff saved to https://phabricator.wikimedia.org/P94269 and previous config saved to /var/cache/conftool/dbconfig/20260618-052737-marostegui.json
* 05:14 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2040.codfw.wmnet with OS trixie
* 05:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2040: Upgrading es2040.codfw.wmnet
* 05:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2040: Upgrading es2040.codfw.wmnet
* 05:12 marostegui@cumin1003: dbmaint on es7@codfw [[phab:T429463|T429463]]
* 05:12 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 45s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303600{{!}}Update interwiki map (T428266)]] (duration: 06m 55s)
* 01:15 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 01:14 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1303600{{!}}Update interwiki map (T428266)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:12 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303600{{!}}Update interwiki map (T428266)]]
* 00:48 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303596{{!}}Activate magwiki (T428266)]] (duration: 07m 25s)
* 00:43 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:42 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1303596{{!}}Activate magwiki (T428266)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:40 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303596{{!}}Activate magwiki (T428266)]]
* 00:33 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303594{{!}}Init magwiki (T428266)]] (duration: 07m 14s)
* 00:29 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:28 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1303594{{!}}Init magwiki (T428266)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303594{{!}}Init magwiki (T428266)]]
== 2026-06-17 ==
* 23:26 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303504{{!}}Enable beta mobile MMV on Wikipedias (T426775)]] (duration: 06m 46s)
* 23:22 egardner@deploy1003: egardner: Continuing with deployment
* 23:21 egardner@deploy1003: egardner: Backport for [[gerrit:1303504{{!}}Enable beta mobile MMV on Wikipedias (T426775)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:19 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1303504{{!}}Enable beta mobile MMV on Wikipedias (T426775)]]
* 23:17 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303552{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303553{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303554{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] (duration: 06m 55s)
* 23:14 mutante: gerrit2002 - unlink /srv/gerrit/site_path/review_site/logs/logs ([[phab:T425667|T425667]])
* 23:12 egardner@deploy1003: egardner: Continuing with deployment
* 23:12 egardner@deploy1003: egardner: Backport for [[gerrit:1303552{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303553{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303554{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:10 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1303552{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303553{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303554{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]]
* 23:04 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303571{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303572{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303573{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] (duration: 12m 31s)
* 22:57 egardner@deploy1003: egardner: Continuing with deployment
* 22:56 egardner@deploy1003: egardner: Backport for [[gerrit:1303571{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303572{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303573{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:52 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1303571{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303572{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303573{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]]
* 22:45 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303517{{!}}Donor Delight Badge: Add accessible label and hide popover from AT (T427313)]] (duration: 31m 01s)
* 22:32 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:31 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1303517{{!}}Donor Delight Badge: Add accessible label and hide popover from AT (T427313)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:14 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1303517{{!}}Donor Delight Badge: Add accessible label and hide popover from AT (T427313)]]
* 21:52 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:52 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:29 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:29 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:29 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:28 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:27 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:27 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:23 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:22 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:22 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:21 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:20 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:20 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:15 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:12 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:12 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:09 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:06 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:05 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:02 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 21:02 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 20:45 cdobbins@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dns7002.wikimedia.org with reason: bird.service keeps failing
* 20:41 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-ats (exit_code=0) rolling restart_daemons on A:cp
* 20:41 cdobbins@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns7002.wikimedia.org with OS trixie
* 20:36 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303012{{!}}Enable ULS v2 on group1 wikis]] (duration: 08m 26s)
* 20:31 sbisson@deploy1003: sbisson, abi: Continuing with deployment
* 20:29 sbisson@deploy1003: sbisson, abi: Backport for [[gerrit:1303012{{!}}Enable ULS v2 on group1 wikis]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:27 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1303012{{!}}Enable ULS v2 on group1 wikis]]
* 20:17 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303365{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]], [[gerrit:1303364{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]] (duration: 06m 55s)
* 20:13 sgimeno@deploy1003: sgimeno: Continuing with deployment
* 20:12 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1303365{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]], [[gerrit:1303364{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:11 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1303365{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]], [[gerrit:1303364{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]]
* 19:44 jgreen@dns1005: END - running authdns-update
* 19:42 jgreen@dns1005: START - running authdns-update
* 19:31 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5005*<nowiki>}</nowiki> and A:liberica ([[phab:T428229|T428229]])
* 19:30 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5005*<nowiki>}</nowiki> and A:liberica ([[phab:T428229|T428229]])
* 19:16 jhuneidi@deploy1003: Finished scap sync-world: wmf.7 to group 1 (Take 2) (duration: 07m 01s)
* 19:16 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-purged (exit_code=0) rolling restart_daemons on A:cp and not P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:10 jhuneidi@deploy1003: Started scap sync-world: wmf.7 to group 1 (Take 2)
* 19:08 jhuneidi@deploy1003: Finished scap sync-world: Attempt to roll wmf.7 to group 1 (duration: 07m 24s)
* 19:01 jhuneidi@deploy1003: Started scap sync-world: Attempt to roll wmf.7 to group 1
* 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcontrol1008-dev.eqiad.wmnet
* 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol1008-dev.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 18:59 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol1008-dev.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 18:52 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 18:46 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudcontrol1008-dev.eqiad.wmnet
* 18:24 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6011.*
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6011.drmrs.wmnet
* 18:24 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp6011.drmrs.wmnet
* 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp6011.drmrs.wmnet with reason: ats restart, continuing from failed cookbook run
* 18:17 brett: commit new lvs5005 IP address to cr2-eqsin.wikimedia.org,cr3-eqsin.wikimedia.org
* 18:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp6011.drmrs.wmnet
* 18:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp6011.drmrs.wmnet
* 18:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6011.*
* 17:41 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs5005.eqsin.wmnet with OS bookworm
* 17:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs5005.eqsin.wmnet with reason: host reimage
* 17:16 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs5005.eqsin.wmnet with reason: host reimage
* 17:06 mutante: contint1003 - even with gerrit:1301416 jenkins was STILL restarted :/ - stopping it manually and puppet - debugging - [[phab:T418521|T418521]]
* 17:03 mutante: contint1003 - re-enabling puppet - checking it does NOT start jenkins - also see gerrit:1297236 and gerrit:1301416 - [[phab:T418521|T418521]]
* 16:51 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:51 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:49 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-ats rolling restart_daemons on A:cp
* 16:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host lvs5005
* 16:48 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs5005
* 16:48 dcausse@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:47 dcausse@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:47 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs5005
* 16:47 brett@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:47 brett@cumin2002: START - Cookbook sre.dns.wipe-cache lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:45 brett@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:45 brett@cumin2002: START - Cookbook sre.dns.wipe-cache lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:45 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:45 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host lvs5005 - brett@cumin2002"
* 16:45 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host lvs5005 - brett@cumin2002"
* 16:45 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:45 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:39 brett@cumin2002: START - Cookbook sre.dns.netbox
* 16:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1078.eqiad.wmnet with OS trixie
* 16:16 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:16 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host lvs5005
* 16:16 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:15 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs5005.eqsin.wmnet with OS bookworm
* 16:15 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1007.eqiad.wmnet with OS trixie
* 16:15 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:11 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:02 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 16:02 brett@cumin2002: START - Cookbook sre.loadbalancer.admin depooling P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 16:00 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-purged rolling restart_daemons on A:cp and not P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 15:58 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1078.eqiad.wmnet with reason: host reimage
* 15:54 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1007.eqiad.wmnet with reason: host reimage
* 15:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Migration of es2048.codfw.wmnet completed
* 15:53 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1078.eqiad.wmnet with reason: host reimage
* 15:47 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1007.eqiad.wmnet with reason: host reimage
* 15:46 moritzm: installing python-ldap security updates
* 15:42 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1078.eqiad.wmnet with OS trixie
* 15:30 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:27 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 15:26 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
* 15:08 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Migration of es2048.codfw.wmnet completed
* 15:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:03 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1004.eqiad.wmnet with OS trixie
* 15:02 aokoth@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab (duration: 01m 24s)
* 15:00 aokoth@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab
* 14:59 cdobbins@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 14:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2048.codfw.wmnet with OS trixie
* 14:56 cdobbins@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 14:44 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1004.eqiad.wmnet with reason: host reimage
* 14:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2048.codfw.wmnet with reason: host reimage
* 14:35 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1004.eqiad.wmnet with reason: host reimage
* 14:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2048.codfw.wmnet with reason: host reimage
* 14:28 cdobbins@cumin1003: START - Cookbook sre.hosts.reimage for host dns7002.wikimedia.org with OS trixie
* 14:26 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303436{{!}}Add Wikidata configuration for WikiProject links (T422935 T422936)]] (duration: 07m 49s)
* 14:22 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
* 14:21 cjd91: depooling dns7002 to attempt reimage to trixie
* 14:20 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1303436{{!}}Add Wikidata configuration for WikiProject links (T422935 T422936)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:19 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1004.eqiad.wmnet with OS trixie
* 14:19 cdobbins@cumin1003: conftool action : set/pooled=no; selector: name=dns7002.*
* 14:18 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1303436{{!}}Add Wikidata configuration for WikiProject links (T422935 T422936)]]
* 14:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2048.codfw.wmnet with OS trixie
* 14:17 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 14:17 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 14:17 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 14:16 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 14:16 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2048: Upgrading es2048.codfw.wmnet
* 14:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2048: Upgrading es2048.codfw.wmnet
* 14:13 elukey: add basic Kafka ACLs for anonymous to logging-eqiad - [[phab:T425528|T425528]]
* 14:13 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:13 Lucas_WMDE: UTC afternoon backport+config window done
* {{safesubst:SAL entry|1=14:13 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302739{{!}}ULS rewrite: Lock body scroll when open on mobile]], [[gerrit:1302743{{!}}ULS rewrite: Fix settings dialog width and field sizing (T416512)]], [[gerrit:1303010{{!}}ULS rewrite: Show variants even when no languages are available (T426532)]], [[gerrit:1303009{{!}}ULS rewrite: Capture trigger element before async module load (T429145)]], [[gerr}}
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs-test1001.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1003.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1002.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1001.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs-test1001.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1003.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1002.eqiad.wmnet
* 14:11 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1001.eqiad.wmnet
* 14:11 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs*.eqiad.wmnet
* 14:08 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, abi: Continuing with deployment
* 14:06 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:01 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 14:00 jmm@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2003.codfw.wmnet with OS bookworm
* 13:58 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2004.codfw.wmnet with OS bookworm
* 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* {{safesubst:SAL entry|1=13:55 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, abi: Backport for [[gerrit:1302739{{!}}ULS rewrite: Lock body scroll when open on mobile]], [[gerrit:1302743{{!}}ULS rewrite: Fix settings dialog width and field sizing (T416512)]], [[gerrit:1303010{{!}}ULS rewrite: Show variants even when no languages are available (T426532)]], [[gerrit:1303009{{!}}ULS rewrite: Capture trigger element before async module load (T429145)]], [[ge}}
* {{safesubst:SAL entry|1=13:53 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1302739{{!}}ULS rewrite: Lock body scroll when open on mobile]], [[gerrit:1302743{{!}}ULS rewrite: Fix settings dialog width and field sizing (T416512)]], [[gerrit:1303010{{!}}ULS rewrite: Show variants even when no languages are available (T426532)]], [[gerrit:1303009{{!}}ULS rewrite: Capture trigger element before async module load (T429145)]], [[gerri}}
* 13:52 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 13:51 jmm@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 13:51 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.bmc-user-mgmt (exit_code=0) for host sretest[2001,2003-2004,2006,2009-2010].codfw.wmnet,sretest1005.eqiad.wmnet
* 13:50 elukey@cumin1003: START - Cookbook sre.hosts.bmc-user-mgmt for host sretest[2001,2003-2004,2006,2009-2010].codfw.wmnet,sretest1005.eqiad.wmnet
* 13:47 papaul: mgmt interface change on mr-codfw
* 13:46 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-codfw with reason: mgmt interface change
* 13:45 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-codfw with reason: switch refresh
* 13:42 jmm@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:42 jmm@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 13:33 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298293{{!}}Add Wikidata configuration for WikiProject links (T422935)]], [[gerrit:1299943{{!}}Add instance-of WikiProject links for paintings and elections (T422936)]] (duration: 08m 14s)
* 13:32 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1006.eqiad.wmnet with OS trixie
* 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudcephosd1016
* 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudcephosd1016
* 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1061
* 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1061
* 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1069
* 13:31 lucaswerkmeister-wmde@deploy1003: sadiyamohammed13, lucaswerkmeister-wmde: Rolling back deployment
* 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1069
* 13:30 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1068
* 13:30 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1068
* 13:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1005.eqiad.wmnet with OS trixie
* 13:27 lucaswerkmeister-wmde@deploy1003: sadiyamohammed13, lucaswerkmeister-wmde: Backport for [[gerrit:1298293{{!}}Add Wikidata configuration for WikiProject links (T422935)]], [[gerrit:1299943{{!}}Add instance-of WikiProject links for paintings and elections (T422936)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298293{{!}}Add Wikidata configuration for WikiProject links (T422935)]], [[gerrit:1299943{{!}}Add instance-of WikiProject links for paintings and elections (T422936)]]
* 13:24 jmm@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 13:23 jmm@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 13:14 dani@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302998{{!}}Add English Wikipedia Mobile App Survey (T428876)]] (duration: 07m 53s)
* 13:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1006.eqiad.wmnet with reason: host reimage
* 13:11 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-codfw
* 13:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1005.eqiad.wmnet with reason: host reimage
* 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-eqiad
* 13:10 dani@deploy1003: dani: Continuing with deployment
* 13:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1045: repool after upgrade
* 13:08 dani@deploy1003: dani: Backport for [[gerrit:1302998{{!}}Add English Wikipedia Mobile App Survey (T428876)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1006.eqiad.wmnet with reason: host reimage
* 13:06 dani@deploy1003: Started scap sync-world: Backport for [[gerrit:1302998{{!}}Add English Wikipedia Mobile App Survey (T428876)]]
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1005.eqiad.wmnet with reason: host reimage
* 13:00 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:53 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:52 blake@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host mc-gp1006
* 12:52 blake@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host mc-gp1006
* 12:51 blake@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc-gp1006
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mc-gp1006.eqiad.wmnet 182.48.64.10.in-addr.arpa 2.8.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:51 blake@cumin1003: START - Cookbook sre.dns.wipe-cache mc-gp1006.eqiad.wmnet 182.48.64.10.in-addr.arpa 2.8.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host mc-gp1005
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host mc-gp1005
* 12:49 blake@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc-gp1005
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mc-gp1005.eqiad.wmnet 126.32.64.10.in-addr.arpa 6.2.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:49 blake@cumin1003: START - Cookbook sre.dns.wipe-cache mc-gp1005.eqiad.wmnet 126.32.64.10.in-addr.arpa 6.2.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host mc-gp1005 - blake@cumin1003"
* 12:49 blake@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host mc-gp1005 - blake@cumin1003"
* 12:48 blake@cumin1003: START - Cookbook sre.dns.netbox
* 12:45 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-codfw
* 12:45 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-eqiad
* 12:43 blake@cumin1003: START - Cookbook sre.dns.netbox
* 12:41 blake@cumin1003: START - Cookbook sre.hosts.move-vlan for host mc-gp1006
* 12:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 12:41 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:41 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:41 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:41 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1006.eqiad.wmnet with OS trixie
* 12:41 blake@cumin1003: START - Cookbook sre.hosts.move-vlan for host mc-gp1005
* 12:40 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1005.eqiad.wmnet with OS trixie
* 12:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2004.codfw.wmnet with reason: host reimage
* 12:37 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1163: Migration of db1163.eqiad.wmnet completed
* 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2003.codfw.wmnet with reason: host reimage
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:32 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 12:32 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 12:32 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 12:32 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 12:29 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2004.codfw.wmnet with reason: host reimage
* 12:28 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2003.codfw.wmnet with reason: host reimage
* 12:24 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1045: repool after upgrade
* 12:23 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:22 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1045.eqiad.wmnet with OS trixie
* 12:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2001.codfw.wmnet with reason: host reimage
* 12:19 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:16 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2004.codfw.wmnet with OS bookworm
* 12:16 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2003.codfw.wmnet with OS bookworm
* 12:15 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2001.codfw.wmnet with reason: host reimage
* 12:13 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 12:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:07 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:07 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:07 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 12:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:05 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1045.eqiad.wmnet with reason: host reimage
* 12:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2044: repool after maintenance es2044
* 12:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 12:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 12:01 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1045.eqiad.wmnet with reason: host reimage
* 11:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 11:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 11:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 11:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 11:51 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 11:51 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1163: Migration of db1163.eqiad.wmnet completed
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1045.eqiad.wmnet with OS trixie
* 11:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1045: Upgrading es1045.eqiad.wmnet
* 11:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1045: Upgrading es1045.eqiad.wmnet
* 11:42 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1163.eqiad.wmnet with OS trixie
* 11:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 11:35 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 11:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1191.eqiad.wmnet with reason: upgrading
* 11:23 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 11:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1163.eqiad.wmnet with reason: host reimage
* 11:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1172.eqiad.wmnet with reason: upgrading
* 11:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet
* 11:21 marostegui@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1:00:00 on db1171.eqiad.wmnet with reason: upgrading
* 11:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: upgrading
* 11:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1163.eqiad.wmnet with reason: host reimage
* 11:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:17 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2044: repool after maintenance es2044
* 11:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2044.codfw.wmnet with OS trixie
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1003.eqiad.wmnet with OS bookworm
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:11 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:10 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:09 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:08 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1038: Migration of es1038.eqiad.wmnet completed
* 11:04 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1163.eqiad.wmnet with OS trixie
* 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:01 moritzm: The Debian mirror on mirrors.wikimedia.org has been disabled [[phab:T416707|T416707]]
* 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1163: Upgrading db1163.eqiad.wmnet
* 10:59 btullis@cumin1003: START - Cookbook sre.hosts.dhcp for host dse-k8s-wdqs2001.codfw.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1163: Upgrading db1163.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2044.codfw.wmnet with reason: host reimage
* 10:53 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2044.codfw.wmnet with reason: host reimage
* 10:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1003.eqiad.wmnet with reason: host reimage
* 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2203: Migration of db2203.codfw.wmnet completed
* 10:43 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1003.eqiad.wmnet with reason: host reimage
* 10:38 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2044.codfw.wmnet with OS trixie
* 10:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2044: Upgrading es2044.codfw.wmnet
* 10:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2044: Upgrading es2044.codfw.wmnet
* 10:35 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:35 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:35 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:34 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:34 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:34 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:31 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1003.eqiad.wmnet with OS bookworm
* 10:29 moritzm: installing git-lfs security updates
* 10:28 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 10:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1038: Migration of es1038.eqiad.wmnet completed
* 10:22 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 10:21 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 10:17 claime: cumin -x 'A:swift-fe' "enable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1038.eqiad.wmnet with OS trixie
* 10:12 claime: cumin -x 'A:swift-fe' "disable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
* 10:10 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 10:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:04 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1002.eqiad.wmnet with reason: host reimage
* 10:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2203: Migration of db2203.codfw.wmnet completed
* 10:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1002.eqiad.wmnet with reason: host reimage
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1038.eqiad.wmnet with reason: host reimage
* 09:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1038.eqiad.wmnet with reason: host reimage
* 09:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2203.codfw.wmnet with OS trixie
* 09:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2045: repool after maintenance es2045
* 09:48 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 09:47 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303356{{!}}hCaptcha: Remove config for VE and DT enable (T428883)]], [[gerrit:1303354{{!}}Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883)]] (duration: 15m 32s)
* 09:41 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 09:39 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 09:38 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1038.eqiad.wmnet with OS trixie
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1038: Upgrading es1038.eqiad.wmnet
* 09:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1303356{{!}}hCaptcha: Remove config for VE and DT enable (T428883)]], [[gerrit:1303354{{!}}Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1038: Upgrading es1038.eqiad.wmnet
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:37 marostegui@dns1004: END - running authdns-update
* 09:36 marostegui@cumin1003: dbctl commit (dc=all): 'Set es6 eqiad back to read-write - [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94226 and previous config saved to /var/cache/conftool/dbconfig/20260617-093559-marostegui.json
* 09:35 marostegui@dns1004: START - running authdns-update
* 09:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es1038 [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94225 and previous config saved to /var/cache/conftool/dbconfig/20260617-093513-marostegui.json
* 09:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2203.codfw.wmnet with reason: host reimage
* 09:33 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1037 to es6 primary [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94224 and previous config saved to /var/cache/conftool/dbconfig/20260617-093310-marostegui.json
* 09:32 marostegui: Starting es6 eqiad failover from es1038 to es1037 - [[phab:T429436|T429436]]
* 09:32 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1303356{{!}}hCaptcha: Remove config for VE and DT enable (T428883)]], [[gerrit:1303354{{!}}Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883)]]
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es1037 with weight 0 [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94223 and previous config saved to /var/cache/conftool/dbconfig/20260617-092940-marostegui.json
* 09:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 8 hosts with reason: Primary switchover es6 [[phab:T429436|T429436]]
* 09:29 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es6 eqiad as read-only for maintenance - [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94222 and previous config saved to /var/cache/conftool/dbconfig/20260617-092913-marostegui.json
* 09:27 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2203.codfw.wmnet with reason: host reimage
* 09:26 jynus: testing x1 backups @ cumin2003 [[phab:T427897|T427897]]
* 09:11 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2203.codfw.wmnet with OS trixie
* 09:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2203: Upgrading db2203.codfw.wmnet
* 09:09 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2203: Upgrading db2203.codfw.wmnet
* 09:09 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:07 elukey: add basic Kafka ACLs for anonymous to logging-codfw - [[phab:T425528|T425528]] (I'll add rollback steps in the task if needed)
* 09:06 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2045: repool after maintenance es2045
* 09:06 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es2044: Upgrading es2044.codfw.wmnet
* 09:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2044: Upgrading es2044.codfw.wmnet
* 09:04 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:02 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2046 to es5 codfw primary [[phab:T428572|T428572]]', diff saved to https://phabricator.wikimedia.org/P94219 and previous config saved to /var/cache/conftool/dbconfig/20260617-090221-marostegui.json
* 09:02 joal@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:01 joal@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:00 joal@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 08:59 joal@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:57 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2203 [[phab:T429190|T429190]]', diff saved to https://phabricator.wikimedia.org/P94218 and previous config saved to /var/cache/conftool/dbconfig/20260617-085615-cwilliams.json
* 08:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2009.codfw.wmnet with OS trixie
* 08:55 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:55 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:53 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2212 to s1 primary [[phab:T429190|T429190]]', diff saved to https://phabricator.wikimedia.org/P94217 and previous config saved to /var/cache/conftool/dbconfig/20260617-085310-cwilliams.json
* 08:51 cezmunsta: Starting s1 codfw failover from db2203 to db2212 - [[phab:T429190|T429190]]
* 08:51 marostegui@dns1004: END - running authdns-update
* 08:49 marostegui@dns1004: START - running authdns-update
* 08:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:46 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2212 with weight 0 [[phab:T429190|T429190]]', diff saved to https://phabricator.wikimedia.org/P94215 and previous config saved to /var/cache/conftool/dbconfig/20260617-084642-cwilliams.json
* 08:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:46 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T429190|T429190]]
* 08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1044: repool after upgrade
* 08:38 jelto: "Imported helm3 3.19.5-1 to bullseye-wikimedia, bookworm-wikimedia and trixie-wikimedia - [[phab:T427403|T427403]]"
* 08:38 moritzm: installing apache2 security updates
* 08:36 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:35 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2009.codfw.wmnet with reason: host reimage
* 08:31 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2009.codfw.wmnet with reason: host reimage
* 08:25 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303296{{!}}Squashed diff to master]], [[gerrit:1303295{{!}}Squashed diff to master]] (duration: 35m 34s)
* 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2008.codfw.wmnet with OS trixie
* 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:17 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:14 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2009.codfw.wmnet with OS trixie
* 08:12 mlitn@deploy1003: mlitn: Continuing with deployment
* 08:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 08:09 mlitn@deploy1003: mlitn: Backport for [[gerrit:1303296{{!}}Squashed diff to master]], [[gerrit:1303295{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:07 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 08:04 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2008.codfw.wmnet with reason: host reimage
* 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2007.codfw.wmnet with OS trixie
* 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:03 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1001.eqiad.wmnet with OS bookworm
* 08:01 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 08:00 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1044: repool after upgrade
* 08:00 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2008.codfw.wmnet with reason: host reimage
* 07:59 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1044.eqiad.wmnet with OS trixie
* 07:53 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:49 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1303296{{!}}Squashed diff to master]], [[gerrit:1303295{{!}}Squashed diff to master]]
* 07:44 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2007.codfw.wmnet with reason: host reimage
* 07:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 07:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 07:42 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2008.codfw.wmnet with OS trixie
* 07:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1044.eqiad.wmnet with reason: host reimage
* 07:39 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2007.codfw.wmnet with reason: host reimage
* 07:32 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1044.eqiad.wmnet with reason: host reimage
* 07:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:23 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:23 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2007.codfw.wmnet with OS trixie
* 07:22 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:22 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003"
* 07:22 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003
* 07:21 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:21 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003
* 07:21 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003"
* 07:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1044.eqiad.wmnet with OS trixie
* 07:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1044: Upgrading es1044.eqiad.wmnet
* 07:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1044: Upgrading es1044.eqiad.wmnet
* 07:15 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1037: Migration of es1037.eqiad.wmnet completed
* 06:53 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 06:53 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 06:52 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 06:52 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 06:46 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 06:46 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 06:46 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 06:46 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 06:28 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1037: Migration of es1037.eqiad.wmnet completed
* 06:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1037.eqiad.wmnet with OS trixie
* 05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1037.eqiad.wmnet with reason: host reimage
* 05:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1037.eqiad.wmnet with reason: host reimage
* 05:38 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1037.eqiad.wmnet with OS trixie
* 05:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1037: Upgrading es1037.eqiad.wmnet
* 05:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1037: Upgrading es1037.eqiad.wmnet
* 05:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 00:01 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
* 00:01 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
== 2026-06-16 ==
* 23:44 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2006.codfw.wmnet with reason: host reimage
* 23:38 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2006.codfw.wmnet with reason: host reimage
* 23:03 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 23:02 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp - OpenSSL update ()
* 23:01 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
* 22:57 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
* 22:57 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
* 22:52 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
* 22:50 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
* 22:50 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:49 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:37 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
* 22:30 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 22:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:08 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:07 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302953{{!}}Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355)]], [[gerrit:1302952{{!}}Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355)]] (duration: 08m 11s)
* 22:02 kemayo@deploy1003: kemayo: Continuing with deployment
* 22:01 kemayo@deploy1003: kemayo: Backport for [[gerrit:1302953{{!}}Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355)]], [[gerrit:1302952{{!}}Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:59 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1302953{{!}}Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355)]], [[gerrit:1302952{{!}}Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355)]]
* 21:52 ryankemper@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 21:50 ryankemper@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 21:49 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 21:48 ryankemper@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 21:48 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:46 ryankemper@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:45 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:38 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:34 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302934{{!}}Update definition of html heading to match Parsoid/core (T417530 T417531 T428677)]] (duration: 18m 41s)
* 21:32 robh@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:31 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:30 robh@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:29 cscott@deploy1003: arlolra, cscott: Continuing with deployment
* 21:26 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 21:25 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 21:24 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 21:24 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 21:23 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 21:21 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 21:20 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 21:20 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 21:17 cscott@deploy1003: arlolra, cscott: Backport for [[gerrit:1302934{{!}}Update definition of html heading to match Parsoid/core (T417530 T417531 T428677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1302934{{!}}Update definition of html heading to match Parsoid/core (T417530 T417531 T428677)]]
* 21:10 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:08 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 20:54 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2043.*
* 20:51 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302890{{!}}Guard round function with a supports query (T424596)]], [[gerrit:1302935{{!}}Add wprov parameter to home link (T429268)]] (duration: 09m 28s)
* 20:47 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:43 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1302890{{!}}Guard round function with a supports query (T424596)]], [[gerrit:1302935{{!}}Add wprov parameter to home link (T429268)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:41 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1302890{{!}}Guard round function with a supports query (T424596)]], [[gerrit:1302935{{!}}Add wprov parameter to home link (T429268)]]
* 20:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns5004.*
* 20:33 brett@dns1004: END - running authdns-update
* 20:31 brett@dns1004: START - running authdns-update
* 20:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns5004.wikimedia.org with OS bookworm
* 20:30 brett@dns5004: FAIL - running authdns-update
* 20:29 brett@dns5004: START - running authdns-update
* 20:28 brett@dns5004: FAIL - running authdns-update
* 20:27 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302320{{!}}EditChecks: Namespace tracking object for seen/shown/used checks]] (duration: 09m 50s)
* 20:26 brett@dns5004: START - running authdns-update
* 20:26 brett@dns5004: START - running authdns-update
* 20:25 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns5004.*,service=authdns-update
* 20:23 kemayo@deploy1003: kemayo: Continuing with deployment
* 20:19 kemayo@deploy1003: kemayo: Backport for [[gerrit:1302320{{!}}EditChecks: Namespace tracking object for seen/shown/used checks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:18 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 20:17 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1302320{{!}}EditChecks: Namespace tracking object for seen/shown/used checks]]
* 20:09 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 20:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1001.eqiad.wmnet with reason: host reimage
* 19:56 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1001.eqiad.wmnet with reason: host reimage
* 19:55 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 19:55 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:54 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:47 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:46 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 19:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1001.eqiad.wmnet with OS bookworm
* 19:39 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:35 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp - OpenSSL update ()
* 19:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:31 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 19:30 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp - OpenSSL update ()
* 19:27 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:18 topranks: restarting grpc server on eqiad SR-Linux switches to recover from problem of no free threads [[phab:T429242|T429242]]
* 19:08 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:08 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:00 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302274{{!}}Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188)]] (duration: 11m 18s)
* 18:58 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:56 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:56 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:55 krinkle@deploy1003: krinkle: Continuing with deployment
* 18:52 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:51 krinkle@deploy1003: krinkle: Backport for [[gerrit:1302274{{!}}Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:48 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1302274{{!}}Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188)]]
* 18:45 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns5004.wikimedia.org with reason: host reimage
* 18:41 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
* 18:41 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/data-gateway: apply
* 18:41 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns5004.wikimedia.org with reason: host reimage
* 18:40 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
* 18:39 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
* 18:39 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 18:39 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 18:35 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 18:34 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 18:33 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 18:30 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 18:23 jhuneidi@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.7 refs [[phab:T423916|T423916]]
* 18:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:12 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host dns5004
* 18:12 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dns5004
* 18:08 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dns5004
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dns5004.wikimedia.org 8.166.102.103.in-addr.arpa 8.0.0.0.6.6.1.0.2.0.1.0.3.0.1.0.1.0.0.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 18:08 brett@cumin2002: START - Cookbook sre.dns.wipe-cache dns5004.wikimedia.org 8.166.102.103.in-addr.arpa 8.0.0.0.6.6.1.0.2.0.1.0.3.0.1.0.1.0.0.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host dns5004 - brett@cumin2002"
* 18:08 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host dns5004 - brett@cumin2002"
* 18:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:00 brett@cumin2002: START - Cookbook sre.dns.netbox
* 18:00 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:59 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:53 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=dns5004.*
* 17:47 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:47 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change mgmt name for frproto1001 - cmooney@cumin1003"
* 17:46 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host dns5004
* 17:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host dns5004.wikimedia.org with OS bookworm
* 17:44 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change mgmt name for frproto1001 - cmooney@cumin1003"
* 17:43 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host conf2007.codfw.wmnet with OS trixie
* 17:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302912{{!}}Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it"]], [[gerrit:1302909{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]], [[gerrit:1302908{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]] (duration: 32m 19s)
* 17:38 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 17:30 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:29 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302912{{!}}Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it"]], [[gerrit:1302909{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]], [[gerrit:1302908{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:27 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2007.codfw.wmnet with OS trixie
* 17:25 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1007.eqiad.wmnet with OS trixie
* 17:20 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
* 17:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302912{{!}}Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it"]], [[gerrit:1302909{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]], [[gerrit:1302908{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]]
* 16:35 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:09 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab1004 - [[phab:T429350|T429350]] (duration: 00m 45s)
* 16:08 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab1004 - [[phab:T429350|T429350]]
* 16:08 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy
* 16:08 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab2002 - [[phab:T429350|T429350]] (duration: 00m 47s)
* 16:07 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab2002 - [[phab:T429350|T429350]]
* 16:06 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy
* 16:04 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 15:42 urbanecm@deploy1003: mwscript-k8s job started: GrowthExperiments:migrateMentorStatusAway --wiki=abwiki --dry-run # [[phab:T409170|T409170]]
* 15:39 moritzm: installing Tomcat security updates
* 15:38 urbanecm: Remove `migrateMentorStatusAwayToCommunityConfiguration` from `updatelog` on all wikis in `growthexperiments.dblist` ([[phab:T409170|T409170]])
* 15:38 dancy@deploy1003: Installation of scap version "4.269.0" completed for 2 hosts
* 15:36 dancy@deploy1003: Installing scap version "4.269.0" for 2 host(s)
* 15:33 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: test deploy phab2003 - [[phab:T427286|T427286]] (duration: 00m 49s)
* 15:33 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: test deploy phab2003 - [[phab:T427286|T427286]]
* 15:16 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 15:16 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 15:07 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-welcome # [[phab:T429352|T429352]]
* 15:06 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-discovery # [[phab:T429352|T429352]]
* 15:03 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-mentorship # [[phab:T429352|T429352]]
* 15:01 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302804{{!}}Hotfix for T428620 (T428620)]] (duration: 10m 00s)
* 14:57 awight@deploy1003: seanleong-wmde, awight: Continuing with deployment
* 14:55 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-help-panel # [[phab:T429352|T429352]]
* 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for frproto1001 (formerly payments1008) - cmooney@cumin1003"
* 14:54 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for frproto1001 (formerly payments1008) - cmooney@cumin1003"
* 14:53 awight@deploy1003: seanleong-wmde, awight: Backport for [[gerrit:1302804{{!}}Hotfix for T428620 (T428620)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:51 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1302804{{!}}Hotfix for T428620 (T428620)]]
* 14:48 aokoth@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab (duration: 02m 09s)
* 14:46 aokoth@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab
* 14:28 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 14:28 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 14:07 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302792{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187)]], [[gerrit:1302793{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T429187)]] (duration: 11m 29s)
* 14:07 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:03 dcausse@deploy1003: jgiannelos, dcausse: Continuing with deployment
* 14:02 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 14:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:58 dcausse@deploy1003: jgiannelos, dcausse: Backport for [[gerrit:1302792{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187)]], [[gerrit:1302793{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T429187)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:56 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1302792{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187)]], [[gerrit:1302793{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T429187)]]
* 13:54 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:52 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:52 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:52 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:51 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:48 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302850{{!}}Revert "translate: remove CirrusSearch endpoints"]] (duration: 04m 10s)
* 13:47 atsuko@deploy1003: atsuko: Rolling back deployment
* 13:47 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:46 atsuko@deploy1003: atsuko: Backport for [[gerrit:1302850{{!}}Revert "translate: remove CirrusSearch endpoints"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:44 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1302850{{!}}Revert "translate: remove CirrusSearch endpoints"]]
* 13:44 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:43 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:43 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:43 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:41 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:40 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 13:40 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 13:39 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:39 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302197{{!}}translate: remove CirrusSearch endpoints (T425377)]] (duration: 11m 16s)
* 13:37 atsuko@deploy1003: atsuko: Rolling back deployment
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1080.eqiad.wmnet with OS trixie
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:36 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:34 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 13:32 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1079.eqiad.wmnet with OS trixie
* 13:32 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:30 atsuko@deploy1003: atsuko: Backport for [[gerrit:1302197{{!}}translate: remove CirrusSearch endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1302197{{!}}translate: remove CirrusSearch endpoints (T425377)]]
* 13:25 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299626{{!}}Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206)]] (duration: 08m 50s)
* 13:25 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:21 dcausse@deploy1003: dcausse, neriah: Continuing with deployment
* 13:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:20 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1080.eqiad.wmnet with reason: host reimage
* 13:18 dcausse@deploy1003: dcausse, neriah: Backport for [[gerrit:1299626{{!}}Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:16 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1299626{{!}}Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206)]]
* 13:15 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:12 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1080.eqiad.wmnet with reason: host reimage
* 13:12 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298875{{!}}Remove custom streams (T423148)]] (duration: 08m 35s)
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1079.eqiad.wmnet with reason: host reimage
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1008.eqiad.wmnet with OS trixie
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:07 jmm@dns1004: END - running authdns-update
* 13:06 mfossati@deploy1003: ksarabia, mfossati: Continuing with deployment
* 13:05 mfossati@deploy1003: ksarabia, mfossati: Backport for [[gerrit:1298875{{!}}Remove custom streams (T423148)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 jmm@dns1004: START - running authdns-update
* 13:03 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1298875{{!}}Remove custom streams (T423148)]]
* 13:02 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1079.eqiad.wmnet with reason: host reimage
* 13:02 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:02 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1080.eqiad.wmnet with OS trixie
* 12:57 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:52 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1079.eqiad.wmnet with OS trixie
* 12:52 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 12:51 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1007.eqiad.wmnet with OS trixie
* 12:50 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 12:50 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver2002.codfw.wmnet
* 12:48 cmooney@cumin1003: START - Cookbook sre.mysql.pool pool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2255.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2255.codfw.wmnet
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2254.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2254.codfw.wmnet
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2243.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2243.codfw.wmnet
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2242.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2242.codfw.wmnet
* 12:47 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2092.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2092.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2091.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2091.codfw.wmnet
* 12:46 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 29 hosts
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2078.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2078.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2077.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2077.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2076.codfw.wmnet
* 12:46 cmooney@cumin1003: START - Cookbook sre.hosts.remove-downtime for 29 hosts
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2076.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2075.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2075.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2074.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2074.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2051.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2051.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2044.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2044.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2041.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2041.codfw.wmnet
* 12:46 cmooney@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2001.codfw.wmnet
* 12:46 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:45 cmooney@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2001.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2018.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2018.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2017.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2017.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2014.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2014.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2013.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2013.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2012.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2012.codfw.wmnet
* 12:44 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1008.eqiad.wmnet with reason: host reimage
* 12:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver2002.codfw.wmnet
* 12:40 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1008.eqiad.wmnet with reason: host reimage
* 12:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1006.eqiad.wmnet with reason: host reimage
* 12:28 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:24 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1008.eqiad.wmnet with OS trixie
* 12:24 topranks: reboot lsw1-a5-codfw to complete JunOS upgrade [[phab:T428020|T428020]]
* 12:23 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
* 12:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1006.eqiad.wmnet with reason: host reimage
* 12:19 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2255.codfw.wmnet
* 12:19 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2255.codfw.wmnet
* 12:19 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2254.codfw.wmnet
* 12:18 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2254.codfw.wmnet
* 12:17 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2243.codfw.wmnet
* 12:17 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2243.codfw.wmnet
* 12:17 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2242.codfw.wmnet
* 12:16 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2242.codfw.wmnet
* 12:16 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2092.codfw.wmnet
* 12:16 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2092.codfw.wmnet
* 12:16 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2091.codfw.wmnet
* 12:15 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2091.codfw.wmnet
* 12:15 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2078.codfw.wmnet
* 12:14 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2078.codfw.wmnet
* 12:14 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2077.codfw.wmnet
* 12:14 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2077.codfw.wmnet
* 12:14 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2076.codfw.wmnet
* 12:13 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2076.codfw.wmnet
* 12:13 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2075.codfw.wmnet
* 12:12 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2075.codfw.wmnet
* 12:12 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2074.codfw.wmnet
* 12:12 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2074.codfw.wmnet
* 12:12 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2051.codfw.wmnet
* 12:10 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 29 hosts with reason: lsw1-a5-codfw JunOS upgrade
* 12:07 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2051.codfw.wmnet
* 12:06 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lsw1-a5-codfw,lsw1-a5-codfw IPv6,lsw1-a5-codfw.mgmt,ssw1-a[1,8]-codfw.mgmt with reason: switch upgrade
* 12:06 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2044.codfw.wmnet
* 12:06 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2044.codfw.wmnet
* 12:06 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2041.codfw.wmnet
* 12:05 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2041.codfw.wmnet
* 12:05 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2018.codfw.wmnet
* 12:05 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2018.codfw.wmnet
* 12:04 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2017.codfw.wmnet
* 12:04 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2017.codfw.wmnet
* 12:04 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2014.codfw.wmnet
* 12:03 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2014.codfw.wmnet
* 12:03 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2013.codfw.wmnet
* 12:03 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2013.codfw.wmnet
* 12:02 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2012.codfw.wmnet
* 12:02 cmooney@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2001.codfw.wmnet
* 12:01 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2012.codfw.wmnet
* 12:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 11:57 cmooney@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2001.codfw.wmnet
* 11:51 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302794{{!}}Revert "hCaptcha: Enable for UploadWizard on all wikis with it"]] (duration: 08m 45s)
* 11:49 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:49 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:49 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 11:48 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:47 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:47 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 11:46 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:46 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:46 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 11:46 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:45 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302794{{!}}Revert "hCaptcha: Enable for UploadWizard on all wikis with it"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:43 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 11:43 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302794{{!}}Revert "hCaptcha: Enable for UploadWizard on all wikis with it"]]
* 11:42 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 11:41 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2035: Migration of es2035.codfw.wmnet completed
* 11:06 moritzm: installing Bird security updates on routed Ganeti nodes
* 10:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es1037 [[phab:T429118|T429118]]', diff saved to https://phabricator.wikimedia.org/P94172 and previous config saved to /var/cache/conftool/dbconfig/20260616-104931-marostegui.json
* 10:25 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 10:24 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 10:24 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2035: Migration of es2035.codfw.wmnet completed
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for an-redacteddb1001.eqiad.wmnet
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for an-redacteddb1001.eqiad.wmnet
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 11 hosts
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for 11 hosts
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1155.eqiad.wmnet
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1155.eqiad.wmnet
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1154.eqiad.wmnet
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1154.eqiad.wmnet
* 10:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Migration of es1036.eqiad.wmnet completed
* 10:22 jmm@dns1004: END - running authdns-update
* 10:22 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:20 jmm@dns1004: START - running authdns-update
* 10:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:19 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 10:18 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 10:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:18 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 10:18 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 10:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2035.codfw.wmnet with OS trixie
* 09:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2035.codfw.wmnet with reason: host reimage
* 09:52 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2035.codfw.wmnet with reason: host reimage
* 09:49 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 09:48 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 09:47 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302762{{!}}hCaptcha: Enable for UploadWizard on all wikis with it (T426126)]] (duration: 09m 38s)
* 09:43 marostegui: Drop wrongly created tables on testwikidatawiki s3 master [[phab:T429304|T429304]]
* 09:42 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 09:39 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302762{{!}}hCaptcha: Enable for UploadWizard on all wikis with it (T426126)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:38 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --wiki=wikidatawiki --registeredWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # [[phab:T418115|T418115]]
* 09:37 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302762{{!}}hCaptcha: Enable for UploadWizard on all wikis with it (T426126)]]
* 09:37 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --wiki=wikidatawiki --registeredWithin=1year --editedWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # [[phab:T418115|T418115]]
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Migration of es1036.eqiad.wmnet completed
* 09:37 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --registeredWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # [[phab:T418115|T418115]]
* 09:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2035.codfw.wmnet with OS trixie
* 09:34 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2035: Upgrading es2035.codfw.wmnet
* 09:34 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2035: Upgrading es2035.codfw.wmnet
* 09:34 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:32 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2035 [[phab:T429303|T429303]]', diff saved to https://phabricator.wikimedia.org/P94164 and previous config saved to /var/cache/conftool/dbconfig/20260616-093247-marostegui.json
* 09:31 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2037 to es6 primary [[phab:T429303|T429303]]', diff saved to https://phabricator.wikimedia.org/P94163 and previous config saved to /var/cache/conftool/dbconfig/20260616-093149-marostegui.json
* 09:31 jayme: imported istioctl 1.29.4-1 to bookworm-/trixie-wikimedia - [[phab:T427401|T427401]]
* 09:30 marostegui: Starting es6 codfw failover from es2035 to es2037 - [[phab:T429303|T429303]]
* 09:30 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 09:30 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 09:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es2037 with weight 0 [[phab:T429303|T429303]]', diff saved to https://phabricator.wikimedia.org/P94162 and previous config saved to /var/cache/conftool/dbconfig/20260616-092937-marostegui.json
* 09:29 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 8 hosts with reason: Primary switchover es6 [[phab:T429303|T429303]]
* 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1036.eqiad.wmnet with OS trixie
* 09:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:19 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297161{{!}}[Growth] wikidatawiki: Enable Growth features (T418115)]] (duration: 16m 29s)
* 09:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:14 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 09:13 urbanecm: php multiversion/MWScript.php WikimediaMaintenance:createExtensionTables.php --wiki=<nowiki>{</nowiki>testwikidatawiki,wikidatawiki<nowiki>}</nowiki> growthexperiments # [[phab:T418115|T418115]], within mw-debug
* 09:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage
* 09:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 09:07 tappof@cumin1003: END (PASS) - Cookbook sre.metamonitoring.downtime (exit_code=0) Downtime for 0:05:00 of prometheus/deadmanswitchnotified, prometheus/deadmanswitchonamdb, prometheus/extmon on 2 host(s) with reason: cookbook test
* 09:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 09:07 tappof@cumin1003: START - Cookbook sre.metamonitoring.downtime Downtime for 0:05:00 of prometheus/deadmanswitchnotified, prometheus/deadmanswitchonamdb, prometheus/extmon on 2 host(s) with reason: cookbook test
* 09:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 09:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 09:04 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1297161{{!}}[Growth] wikidatawiki: Enable Growth features (T418115)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage
* 09:02 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1297161{{!}}[Growth] wikidatawiki: Enable Growth features (T418115)]]
* 09:01 moritzm: uploaded bird 2.18.2-1~wmf13u1 to trixie-wikimedia [[phab:T429285|T429285]]
* 09:00 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist wikidata WikimediaMaintenance:createExtensionTables.php GrowthExperiments # [[phab:T418115|T418115]]
* 08:56 moritzm: uploaded bird 2.18.2-1~wmf12u1 to bookworm-wikimedia [[phab:T429285|T429285]]
* 08:48 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1036.eqiad.wmnet with OS trixie
* 08:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1036: Upgrading es1036.eqiad.wmnet
* 08:46 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302735{{!}}hCaptcha: Enable for MobileFrontend in all wikis (T425940)]] (duration: 19m 23s)
* 08:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1036: Upgrading es1036.eqiad.wmnet
* 08:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1047: repool after upgrade
* 08:42 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:32 moritzm: installing nginx security updates
* 08:29 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302735{{!}}hCaptcha: Enable for MobileFrontend in all wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:27 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302735{{!}}hCaptcha: Enable for MobileFrontend in all wikis (T425940)]]
* 08:23 mszwarc@deploy1003: Synchronized private/PrivateSettings.php: Private code deployment for Suggested Investigations (duration: 02m 23s)
* 08:19 mszwarc@deploy1003: Synchronized private/SuggestedInvestigationsSignals: Private code deployment for Suggested Investigations (duration: 06m 03s)
* 08:17 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94157)
* 08:05 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302629{{!}}Improve click intent event logging and exposure tracking]] (duration: 11m 31s)
* 08:00 moritzm: update bird on ganeti7001 to 2.18.2-1~wmf12u1
* 07:58 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
* 07:58 wmde-fisch@deploy1003: wmde-fisch: Backport for [[gerrit:1302629{{!}}Improve click intent event logging and exposure tracking]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1047: repool after upgrade
* 07:54 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1302629{{!}}Improve click intent event logging and exposure tracking]]
* 07:50 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302170{{!}}Update VE core submodule to master (3e79e9934) (T397319 T428764)]] (duration: 36m 13s)
* 07:36 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
* 07:33 wmde-fisch@deploy1003: wmde-fisch: Backport for [[gerrit:1302170{{!}}Update VE core submodule to master (3e79e9934) (T397319 T428764)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:14 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1302170{{!}}Update VE core submodule to master (3e79e9934) (T397319 T428764)]]
* 07:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1047.eqiad.wmnet with OS trixie
* 06:50 hashar@deploy1003: Finished deploy [integration/docroot@2165507]: build: Updating js-yaml to 4.2.0 (duration: 00m 16s)
* 06:50 hashar@deploy1003: Started deploy [integration/docroot@2165507]: build: Updating js-yaml to 4.2.0
* 06:44 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1047.eqiad.wmnet with reason: host reimage
* 06:40 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1047.eqiad.wmnet with reason: host reimage
* 06:25 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1047.eqiad.wmnet with OS trixie
* 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:24 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1047: Upgrading es1047.eqiad.wmnet
* 05:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1047: Upgrading es1047.eqiad.wmnet
* 05:58 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 04:55 ryankemper: [[phab:T427951|T427951]] Deleted 4 leftover mirrored dev/test topics from kafka-test: `eqiad.mediawiki.<nowiki>{</nowiki>page_html_content_change.dev<nowiki>{</nowiki>1,4<nowiki>}</nowiki>,page_edit_type_simple.dev0<nowiki>}</nowiki>`, `eqiad.mw_page_edit_type_enrich.error`
* 04:05 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.4 (duration: 05m 29s)
== 2026-06-15 ==
* 22:35 sbassett: Deployed private config for [[phab:T429244|T429244]]
* 22:05 sbassett: Deployed updated security fix for [[phab:T427611|T427611]]
* 22:04 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 22:04 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 22:04 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 22:03 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 21:54 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302277{{!}}beta: Point remaining db11 references at deployment-db15 (T428930)]] (duration: 12m 27s)
* 21:53 dancy@deploy1003: dancy: Continuing with deployment
* 21:49 dancy@deploy1003: dancy: Backport for [[gerrit:1302277{{!}}beta: Point remaining db11 references at deployment-db15 (T428930)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:48 sbassett: Deployed security fix for [[phab:T428809|T428809]]
* 21:48 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1302277{{!}}beta: Point remaining db11 references at deployment-db15 (T428930)]]
* 21:40 sbassett: Deployed security fix for [[phab:T428820|T428820]]
* 21:22 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302267{{!}}ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks]] (duration: 08m 11s)
* 21:17 sbassett@deploy1003: sbassett: Continuing with deployment
* 21:15 sbassett@deploy1003: sbassett: Backport for [[gerrit:1302267{{!}}ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:13 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1302267{{!}}ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks]]
* 21:06 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5028.*
* 21:06 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 21:05 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 20:52 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5028.eqsin.wmnet with OS trixie
* 20:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 20:21 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300245{{!}}REST: set new RestModuleOverrides variable (T422756)]], [[gerrit:1302232{{!}}Enable "exit the editor" survey on 11 wikis for phase 2 (T426132)]] (duration: 10m 54s)
* 20:17 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 20:16 dancy@deploy1003: caro, dancy, bpirkle: Continuing with deployment
* 20:14 dancy@deploy1003: caro, dancy, bpirkle: Backport for [[gerrit:1300245{{!}}REST: set new RestModuleOverrides variable (T422756)]], [[gerrit:1302232{{!}}Enable "exit the editor" survey on 11 wikis for phase 2 (T426132)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1300245{{!}}REST: set new RestModuleOverrides variable (T422756)]], [[gerrit:1302232{{!}}Enable "exit the editor" survey on 11 wikis for phase 2 (T426132)]]
* 20:02 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS trixie
* 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS trixie
* 19:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5028
* 19:44 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5028
* 19:43 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5028
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5028.eqsin.wmnet 25.0.132.10.in-addr.arpa 5.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:43 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5028.eqsin.wmnet 25.0.132.10.in-addr.arpa 5.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5028 - brett@cumin2002"
* 19:42 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5028 - brett@cumin2002"
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:36 brett@cumin2002: START - Cookbook sre.dns.netbox
* 19:35 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3067.esams.wmnet
* 19:34 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3067.esams.wmnet
* 19:33 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 19:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3066.esams.wmnet
* 19:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:33 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3066.esams.wmnet
* 19:26 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5028
* 19:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 19:23 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart A:liberica-eqsin
* 19:21 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart A:liberica-eqsin
* 19:18 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 19:17 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:16 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:15 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:14 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:06 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
* 19:05 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 19:05 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:04 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5026.eqsin.wmnet with OS trixie
* 18:44 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-purged (exit_code=0) rolling restart_daemons on P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:42 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-purged rolling restart_daemons on P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 18:27 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:27 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:27 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 18:18 mutante: releases2003 - systemctl stop tmp.mount
* 17:53 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5026
* 17:53 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5026
* 17:52 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5026
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5026.eqsin.wmnet 37.0.132.10.in-addr.arpa 7.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 17:52 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5026.eqsin.wmnet 37.0.132.10.in-addr.arpa 7.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5026 - brett@cumin2002"
* 17:52 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5026 - brett@cumin2002"
* 17:46 brett@cumin2002: START - Cookbook sre.dns.netbox
* 17:40 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-d8-eqiad
* 17:40 cmooney@cumin1003: START - Cookbook sre.network.tls for network device ssw1-d8-eqiad
* 17:36 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-c4-eqiad
* 17:35 cmooney@cumin1003: START - Cookbook sre.network.tls for network device lsw1-c4-eqiad
* 17:34 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-c4-eqiad
* 17:34 cmooney@cumin1003: START - Cookbook sre.network.tls for network device lsw1-c4-eqiad
* 17:09 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5026
* 17:07 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5026.eqsin.wmnet with OS trixie
* 17:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:36 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 16:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 16:16 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
* 16:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:16 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/toolhub: apply
* {{safesubst:SAL entry|1=16:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302192{{!}}SourceEditorOverlayHookPayload: Allow aborting of the save (T428287)]], [[gerrit:1302194{{!}}hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287)]], [[gerrit:1302195{{!}}OATHUserRepository: Specify caller in query]], [[gerrit:1302186{{!}}Bump guzzlehttp/psr to version 2.11.0 (T429208)]], [[gerrit:1302169{{!}}NoReferrerLinks: Add re}}
* 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:08 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
* 16:08 dreamyjazz@deploy1003: reedy, dreamyjazz, kharlan: Continuing with deployment
* 16:08 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/toolhub: apply
* {{safesubst:SAL entry|1=16:07 dreamyjazz@deploy1003: reedy, dreamyjazz, kharlan: Backport for [[gerrit:1302192{{!}}SourceEditorOverlayHookPayload: Allow aborting of the save (T428287)]], [[gerrit:1302194{{!}}hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287)]], [[gerrit:1302195{{!}}OATHUserRepository: Specify caller in query]], [[gerrit:1302186{{!}}Bump guzzlehttp/psr to version 2.11.0 (T429208)]], [[gerrit:1302169{{!}}NoReferrerLinks: Add}}
* {{safesubst:SAL entry|1=16:05 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302192{{!}}SourceEditorOverlayHookPayload: Allow aborting of the save (T428287)]], [[gerrit:1302194{{!}}hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287)]], [[gerrit:1302195{{!}}OATHUserRepository: Specify caller in query]], [[gerrit:1302186{{!}}Bump guzzlehttp/psr to version 2.11.0 (T429208)]], [[gerrit:1302169{{!}}NoReferrerLinks: Add rel}}
* 16:04 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:04 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:51 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:51 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: puppet debugging
* 15:50 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: puppet debugging
* 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1196: Migration of db1196.eqiad.wmnet completed
* 15:41 mutante: added new project language 'nyn' - Bantu language spoken by the Nkore and Hema peoples of Southwestern Uganda
* 15:40 dzahn@dns1006: END - running authdns-update
* 15:36 dzahn@dns1006: START - running authdns-update
* 15:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1155.eqiad.wmnet
* 15:19 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1155.eqiad.wmnet
* 15:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1154.eqiad.wmnet
* 15:18 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1154.eqiad.wmnet
* 15:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 11 hosts
* 15:18 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for 11 hosts
* 15:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for an-redacteddb1001.eqiad.wmnet
* 15:17 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for an-redacteddb1001.eqiad.wmnet
* 15:16 topranks: repool esams following cr2-esams rpd crash
* 15:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool esams [reason: no reason specified, no task ID specified]
* 15:13 cmooney@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool esams [reason: no reason specified, no task ID specified]
* 15:02 topranks: depool esams due to cr2-esams rpd crash
* 15:02 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool esams [reason: no reason specified, no task ID specified]
* 15:01 cmooney@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool esams [reason: no reason specified, no task ID specified]
* 15:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 14:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 14:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1196: Migration of db1196.eqiad.wmnet completed
* 14:54 topranks: enable BGP graceful-shutdown sender on cr2-esams to drain traffic [[phab:T427056|T427056]]
* 14:52 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr2-esams,cr2-esams IPv6 with reason: bouncing pic0 to reconfigure port speeds
* 14:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1196.eqiad.wmnet with OS trixie
* 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1077.eqiad.wmnet with OS trixie
* 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 14:24 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2001.codfw.wmnet with reason: tesT
* 14:24 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 14:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
* 14:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
* 14:08 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 14:07 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 14:07 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudvirt1077.eqiad.wmnet with reason: host reimage
* 14:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1077.eqiad.wmnet with reason: host reimage
* 14:06 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 14:05 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 14:05 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 14:04 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 14:03 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1196.eqiad.wmnet with OS trixie
* 14:02 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 14:02 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 14:01 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 14:01 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 14:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1196: Upgrading db1196.eqiad.wmnet
* 14:00 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1196: Upgrading db1196.eqiad.wmnet
* 14:00 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:56 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1077.eqiad.wmnet with OS trixie
* 13:56 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 13:54 federico3: doing a quick restart of sanitarium hosts db1155 and db1154
* 13:53 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94145)
* 13:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1154.eqiad.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1155.eqiad.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 11 hosts with reason: Reboots [[phab:T426633|T426633]]
* 13:49 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet with reason: Reboots [[phab:T426633|T426633]]
* {{safesubst:SAL entry|1=13:43 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300835{{!}}Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742)]], [[gerrit:1302153{{!}}TaskSuggester: avoid nullable logger in setLogger call]], [[gerrit:1302100{{!}}migrateMentorStatusAway: ensure validateStrictly receives objects (T409170)]], [[gerrit:1301451{{!}}Store nowiki source in StripState::extra to support subst-nowiki (T}}
* 13:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 13:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 13:39 jforrester@deploy1003: arlolra, sgimeno, jforrester: Continuing with deployment
* {{safesubst:SAL entry|1=13:37 jforrester@deploy1003: arlolra, sgimeno, jforrester: Backport for [[gerrit:1300835{{!}}Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742)]], [[gerrit:1302153{{!}}TaskSuggester: avoid nullable logger in setLogger call]], [[gerrit:1302100{{!}}migrateMentorStatusAway: ensure validateStrictly receives objects (T409170)]], [[gerrit:1301451{{!}}Store nowiki source in StripState::extra to support subst-nowik}}
* {{safesubst:SAL entry|1=13:35 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1300835{{!}}Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742)]], [[gerrit:1302153{{!}}TaskSuggester: avoid nullable logger in setLogger call]], [[gerrit:1302100{{!}}migrateMentorStatusAway: ensure validateStrictly receives objects (T409170)]], [[gerrit:1301451{{!}}Store nowiki source in StripState::extra to support subst-nowiki (T3}}
* 13:34 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2216: Migration of db2216.codfw.wmnet completed
* 13:29 topranks: enable BGP graceful-shutdown sender on cr2-esams to drain traffic [[phab:T427056|T427056]]
* 13:28 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr2-esams,cr2-esams IPv6 with reason: bouncing pic0 to reconfigure port speeds
* 13:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:26 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 13:25 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 13:25 topranks: cr2-esams, reconfigure chassis fpc to set port 0 to 100G [[phab:T427056|T427056]]
* 13:25 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 13:24 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1251: Migration of db1251.eqiad.wmnet completed
* {{safesubst:SAL entry|1=13:22 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293173{{!}}Configure wgOAuthAutoApprove['protocols'] (T412542 T426614)]], [[gerrit:1300873{{!}}jawiki: remove four rights from the eliminator group (T428942)]], [[gerrit:1301401{{!}}Deploy PRV to 6 wikis (T429038)]], [[gerrit:1300858{{!}}[abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730)]], [[gerrit:1300872{{!}}abstractwiki: Temporary config f}}
* 13:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:18 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:17 jforrester@deploy1003: arlolra, matmarex, jforrester, dragoniez: Continuing with deployment
* {{safesubst:SAL entry|1=13:13 jforrester@deploy1003: arlolra, matmarex, jforrester, dragoniez: Backport for [[gerrit:1293173{{!}}Configure wgOAuthAutoApprove['protocols'] (T412542 T426614)]], [[gerrit:1300873{{!}}jawiki: remove four rights from the eliminator group (T428942)]], [[gerrit:1301401{{!}}Deploy PRV to 6 wikis (T429038)]], [[gerrit:1300858{{!}}[abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730)]], [[gerrit:1300872{{!}}abstractwiki: Te}}
* 13:13 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:12 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* {{safesubst:SAL entry|1=13:12 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1293173{{!}}Configure wgOAuthAutoApprove['protocols'] (T412542 T426614)]], [[gerrit:1300873{{!}}jawiki: remove four rights from the eliminator group (T428942)]], [[gerrit:1301401{{!}}Deploy PRV to 6 wikis (T429038)]], [[gerrit:1300858{{!}}[abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730)]], [[gerrit:1300872{{!}}abstractwiki: Temporary config fo}}
* 13:10 moritzm: installing Linux 6.1.174 on Bookworm hosts
* 13:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:05 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:48 moritzm: installing augeas security updates
* 12:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2216: Migration of db2216.codfw.wmnet completed
* 12:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:43 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2036: Migration of es2036.codfw.wmnet completed
* 12:38 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302124{{!}}Extract a service that initiates SI signal matching (T428557)]], [[gerrit:1302125{{!}}Trigger Suggested Investigations when client hints are saved (T428557)]] (duration: 07m 42s)
* 12:37 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1251: Migration of db1251.eqiad.wmnet completed
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2216.codfw.wmnet with OS trixie
* 12:34 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:34 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 12:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:32 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1302124{{!}}Extract a service that initiates SI signal matching (T428557)]], [[gerrit:1302125{{!}}Trigger Suggested Investigations when client hints are saved (T428557)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:31 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1302124{{!}}Extract a service that initiates SI signal matching (T428557)]], [[gerrit:1302125{{!}}Trigger Suggested Investigations when client hints are saved (T428557)]]
* 12:27 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1251.eqiad.wmnet with OS trixie
* 12:23 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 12:21 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 12:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2216.codfw.wmnet with reason: host reimage
* 12:15 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2216.codfw.wmnet with reason: host reimage
* 12:10 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1251.eqiad.wmnet with reason: host reimage
* 12:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 12:06 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:05 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1251.eqiad.wmnet with reason: host reimage
* 11:56 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 11:55 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 11:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2036: Migration of es2036.codfw.wmnet completed
* 11:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:53 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2216.codfw.wmnet with OS trixie
* 11:50 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2216: Upgrading db2216.codfw.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2216: Upgrading db2216.codfw.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:48 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1251.eqiad.wmnet with OS trixie
* 11:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1251: Upgrading db1251.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1251: Upgrading db1251.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:44 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94128)
* 11:43 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:43 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on eqiad-k8s (dblist: https://phabricator.wikimedia.org/P94127)
* 11:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2036.codfw.wmnet with OS trixie
* 11:37 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2036.codfw.wmnet with reason: host reimage
* 11:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2036.codfw.wmnet with reason: host reimage
* 11:09 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-eqiad
* 11:08 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-eqiad
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2036.codfw.wmnet with OS trixie
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2036: Upgrading es2036.codfw.wmnet
* 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2036: Upgrading es2036.codfw.wmnet
* 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:55 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-codfw
* 10:54 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-codfw
* 10:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2037: repool after upgrade
* 10:52 moritzm: installing openssl security updates on bookworm
* 10:30 cgoubert@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301341{{!}}Close API Portal wiki (T427537)]] (duration: 07m 16s)
* 10:26 cgoubert@deploy1003: cgoubert: Continuing with deployment
* 10:25 cgoubert@deploy1003: cgoubert: Backport for [[gerrit:1301341{{!}}Close API Portal wiki (T427537)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:23 cgoubert@deploy1003: Started scap sync-world: Backport for [[gerrit:1301341{{!}}Close API Portal wiki (T427537)]]
* 10:16 blake@deploy1003: Finished scap sync-world: apache config change ([[phab:T428772|T428772]]) (duration: 06m 41s)
* 10:12 blake@deploy1003: blake: Continuing with deployment
* 10:11 blake@deploy1003: blake: apache config change ([[phab:T428772|T428772]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:10 blake@deploy1003: Started scap sync-world: apache config change ([[phab:T428772|T428772]])
* 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2037: repool after upgrade
* 10:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2037.codfw.wmnet with OS trixie
* 09:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 09:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 09:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 09:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:43 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 09:42 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 09:40 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on eqiad-k8s (dblist: https://phabricator.wikimedia.org/P94120)
* 09:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2037.codfw.wmnet with reason: host reimage
* 09:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2037.codfw.wmnet with reason: host reimage
* 09:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:15 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:14 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2037.codfw.wmnet with OS trixie
* 09:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:12 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:59 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:56 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2037.codfw.wmnet with OS trixie
* 08:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2037.codfw.wmnet with OS trixie
* 08:53 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2037: Upgrading es2037.codfw.wmnet
* 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2037: Upgrading es2037.codfw.wmnet
* 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:45 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:45 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:44 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:43 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:41 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:40 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:35 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P94117 and previous config saved to /var/cache/conftool/dbconfig/20260615-081440-fceratto.json
* 08:10 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301373{{!}}translate: production opensearch on k8s endpoints (T425377)]] (duration: 20m 54s)
* 08:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2047: Migration of es2047.codfw.wmnet completed
* 08:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P94115 and previous config saved to /var/cache/conftool/dbconfig/20260615-080432-fceratto.json
* 08:03 atsuko@deploy1003: atsuko: Continuing with deployment
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P94114 and previous config saved to /var/cache/conftool/dbconfig/20260615-075425-fceratto.json
* 07:53 atsuko@deploy1003: atsuko: Backport for [[gerrit:1301373{{!}}translate: production opensearch on k8s endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:49 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1301373{{!}}translate: production opensearch on k8s endpoints (T425377)]]
* 07:47 dcausse@deploy1003: mwscript-k8s job started: namespaceDupes cswiki --fix # [[phab:T428619|T428619]]
* 07:46 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301675{{!}}Switch wmgUseCalendar to false for dewikivoyage (T429095)]], [[gerrit:1300301{{!}}Add alias namespace for cswiki (T428619)]] (duration: 34m 37s)
* 07:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P94112 and previous config saved to /var/cache/conftool/dbconfig/20260615-074417-fceratto.json
* 07:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:33 dcausse@deploy1003: vadymts1, dcausse: Continuing with deployment
* 07:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:31 cwilliams@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on db-test2001.codfw.wmnet with reason: Testing
* 07:28 dcausse@deploy1003: vadymts1, dcausse: Backport for [[gerrit:1301675{{!}}Switch wmgUseCalendar to false for dewikivoyage (T429095)]], [[gerrit:1300301{{!}}Add alias namespace for cswiki (T428619)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:26 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:26 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:25 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:24 arnaudb@dns1005: END - running authdns-update
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1163 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P94110 and previous config saved to /var/cache/conftool/dbconfig/20260615-072446-fceratto.json
* 07:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 07:24 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:23 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:23 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2047: Migration of es2047.codfw.wmnet completed
* 07:23 arnaudb@dns1005: START - running authdns-update
* 07:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:21 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:11 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2047.codfw.wmnet with OS trixie
* 07:11 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1301675{{!}}Switch wmgUseCalendar to false for dewikivoyage (T429095)]], [[gerrit:1300301{{!}}Add alias namespace for cswiki (T428619)]]
* 07:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 06:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2047.codfw.wmnet with reason: host reimage
* 06:53 moritzm: imported zookeeper 3.4.13-6+wmf12u1 to component/zookeeper34 for bookworm-wikimedia [[phab:T428495|T428495]]
* 06:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2047.codfw.wmnet with reason: host reimage
* 06:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2047.codfw.wmnet with OS trixie
* 06:28 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2047: Upgrading es2047.codfw.wmnet
* 06:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2047: Upgrading es2047.codfw.wmnet
* 06:27 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 06:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:59 marostegui: install mariadb 10.11.18 on pc1 [[phab:T428861|T428861]]
* 05:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2021.codfw.wmnet,pc1021.eqiad.wmnet with reason: upgrading
* 05:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:56 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:56 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:49 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:49 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2046', diff saved to https://phabricator.wikimedia.org/P94105 and previous config saved to /var/cache/conftool/dbconfig/20260615-053403-marostegui.json
* 05:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on es2046.codfw.wmnet with reason: cloning
* 05:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on es2045.codfw.wmnet with reason: crash
* 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2046', diff saved to https://phabricator.wikimedia.org/P94104 and previous config saved to /var/cache/conftool/dbconfig/20260615-053041-marostegui.json
* 02:18 Amir1: making Dexbot a bot in cywiki ([[phab:T428927|T428927]])
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 58s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-14 ==
* 11:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 11:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 11:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 11:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 34s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-13 ==
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-12 ==
* 19:54 dwisehaupt@dns1004: END - running authdns-update
* 19:52 dwisehaupt@dns1004: START - running authdns-update
* 18:33 dwisehaupt@dns1006: END - running authdns-update
* 18:32 dwisehaupt@dns1006: START - running authdns-update
* 16:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 15:59 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 15:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 15:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:43 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301371{{!}}Hotfix for T428620 (T428620)]] (duration: 11m 17s)
* 14:36 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 14:35 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1301371{{!}}Hotfix for T428620 (T428620)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:31 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1301371{{!}}Hotfix for T428620 (T428620)]]
* 14:29 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:28 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:22 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:22 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:22 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:22 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 12:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 12:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 12:04 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:04 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:04 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 12:02 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 12:01 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 11:40 moritzm: installing Linux 5.10.257 on Bullseye hosts
* 11:36 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:35 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:35 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:24 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 11:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:56 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:56 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:49 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:49 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:40 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:37 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:36 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:35 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:12 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:12 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:08 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 09:59 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 09:58 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 09:57 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.disable-merges (exit_code=0)
* 06:11 jmm@cumin2002: START - Cookbook sre.puppet.disable-merges
* 03:07 ryankemper: [[phab:T427951|T427951]] sorry, `[eqiad,codfw].mediawiki.page_html_content_change.rc0` (accidentally a word)
* 03:06 ryankemper: [[phab:T427951|T427951]] Deleted all 20 unused dev/test topics on kafka-jumbo (verified empty first); 2 (`[eqiad,codfw]page_html_content_change.rc0`) were immediately auto-recreated empty by a still-running `dse-k8s` enrichment consumer; awaiting owner confirmation before final re-delete
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 01m 13s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:00 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
== 2026-06-11 ==
* 22:27 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 22:26 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 22:14 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 22:13 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 22:05 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300906{{!}}Restore MediaViewer toggle in Special:Preferences (T428742)]] (duration: 30m 51s)
* 21:58 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host releases2003.codfw.wmnet with OS trixie
* 21:52 egardner@deploy1003: egardner: Continuing with deployment
* 21:51 egardner@deploy1003: egardner: Backport for [[gerrit:1300906{{!}}Restore MediaViewer toggle in Special:Preferences (T428742)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:34 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1300906{{!}}Restore MediaViewer toggle in Special:Preferences (T428742)]]
* 21:34 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: host reimage
* 21:29 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300913{{!}}Avoid the escaping from nowiki processing (T398967)]] (duration: 09m 09s)
* 21:28 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on releases2003.codfw.wmnet with reason: host reimage
* 21:25 arlolra@deploy1003: arlolra: Continuing with deployment
* 21:22 arlolra@deploy1003: arlolra: Backport for [[gerrit:1300913{{!}}Avoid the escaping from nowiki processing (T398967)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:20 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1300913{{!}}Avoid the escaping from nowiki processing (T398967)]]
* 21:07 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300911{{!}}hCaptcha: Enable for badlogin for all small wikis (T426875)]], [[gerrit:1300905{{!}}RadioRangeBallot: Fix strict mode issue (T428947)]] (duration: 10m 43s)
* 21:06 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on A:cp-text and not P<nowiki>{</nowiki>cp7008*<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 21:01 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 21:00 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300911{{!}}hCaptcha: Enable for badlogin for all small wikis (T426875)]], [[gerrit:1300905{{!}}RadioRangeBallot: Fix strict mode issue (T428947)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:56 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300911{{!}}hCaptcha: Enable for badlogin for all small wikis (T426875)]], [[gerrit:1300905{{!}}RadioRangeBallot: Fix strict mode issue (T428947)]]
* 20:51 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300842{{!}}Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313)]], [[gerrit:1300843{{!}}[A11y] Donor Badge: Remove Badge button disappears too quickly (T428646)]], [[gerrit:1300896{{!}}Donor Delight Badge, styles: Amending to final design review feedback (T427313)]] (duration: 34m 10s)
* 20:39 jdrewniak@deploy1003: annet, jdrewniak: Continuing with deployment
* 20:35 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host releases2003.codfw.wmnet with OS trixie
* 20:34 jdrewniak@deploy1003: annet, jdrewniak: Backport for [[gerrit:1300842{{!}}Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313)]], [[gerrit:1300843{{!}}[A11y] Donor Badge: Remove Badge button disappears too quickly (T428646)]], [[gerrit:1300896{{!}}Donor Delight Badge, styles: Amending to final design review feedback (T427313)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1300842{{!}}Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313)]], [[gerrit:1300843{{!}}[A11y] Donor Badge: Remove Badge button disappears too quickly (T428646)]], [[gerrit:1300896{{!}}Donor Delight Badge, styles: Amending to final design review feedback (T427313)]]
* 19:12 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 18:12 ozge@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 18:12 ozge@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 17:52 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300865{{!}}UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146)]] (duration: 08m 15s)
* 17:48 reedy@deploy1003: reedy: Continuing with deployment
* 17:46 reedy@deploy1003: reedy: Backport for [[gerrit:1300865{{!}}UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:44 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1300865{{!}}UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146)]]
* 17:26 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:25 blake@deploy1003: Scap cancelled without rolling back.
* 17:25 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 17:24 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 17:24 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:24 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 17:24 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:23 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 17:23 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 17:23 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:23 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 17:23 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:23 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 17:20 blake@deploy1003: blake: apache config update ([[phab:T428772|T428772]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:20 blake@deploy1003: Started scap sync-world: apache config update ([[phab:T428772|T428772]])
* 17:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 17:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Migration of db2212.codfw.wmnet completed
* 17:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 17:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1235: Migration of db1235.eqiad.wmnet completed
* 17:08 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:45 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:43 dzahn@dns1005: END - running authdns-update
* 16:42 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:41 dzahn@dns1005: START - running authdns-update
* 16:41 mutante: releases.wikimedia.org - switching backend from codfw to eqiad - releases1003 is now the source of rsync for uploaded releases files (use releases.discovery.wmnet to not have to think about it) - [[phab:T418299|T418299]]
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts rdb2007.codfw.wmnet
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts rdb1011.eqiad.wmnet
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2009.codfw.wmnet
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:33 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Migration of db2212.codfw.wmnet completed
* 16:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1235: Migration of db1235.eqiad.wmnet completed
* 16:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2212.codfw.wmnet with OS trixie
* 16:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1235.eqiad.wmnet with OS trixie
* 16:13 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:07 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 16:05 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 16:05 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 16:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 16:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2212.codfw.wmnet with reason: host reimage
* 16:01 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
* 16:01 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:01 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
* 16:01 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 16:00 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
* 16:00 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 16:00 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
* 16:00 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2212.codfw.wmnet with reason: host reimage
* 15:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1235.eqiad.wmnet with reason: host reimage
* 15:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:58 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 15:57 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
* 15:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:57 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 15:57 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/wikifeeds: apply
* 15:56 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2009.codfw.wmnet
* 15:55 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:55 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb1011.eqiad.wmnet
* 15:55 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:55 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2007.codfw.wmnet
* 15:54 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 15:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1235.eqiad.wmnet with reason: host reimage
* 15:54 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 15:53 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 15:53 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 15:40 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 15:40 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2212.codfw.wmnet with OS trixie
* 15:39 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 15:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1235.eqiad.wmnet with OS trixie
* 15:36 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 15:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1235: Upgrading db1235.eqiad.wmnet
* 15:35 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 15:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1235: Upgrading db1235.eqiad.wmnet
* 15:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:32 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:31 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 15:30 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300822{{!}}T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530)]] (duration: 11m 29s)
* 15:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2212: Upgrading db2212.codfw.wmnet
* 15:26 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2212: Upgrading db2212.codfw.wmnet
* 15:26 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:26 cscott@deploy1003: cscott: Continuing with deployment
* 15:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1235: Upgrading db1235.eqiad.wmnet
* 15:25 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1235: Upgrading db1235.eqiad.wmnet
* 15:25 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:21 cscott@deploy1003: cscott: Backport for [[gerrit:1300822{{!}}T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:19 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1300822{{!}}T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530)]]
* 15:18 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 15:17 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 15:13 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 15:13 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 15:13 moritzm: installing libdbi-perl security updates
* 14:53 moritzm: installing Bind security updates (just client-side tools/libraries)
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry
* 14:48 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry
* 14:43 moritzm: installing Poppler security updates
* 14:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:33 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 14:32 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 14:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1234: Migration of db1234.eqiad.wmnet completed
* 14:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
* 14:24 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
* 14:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
* 14:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
* 14:00 Lucas_WMDE: UTC afternoon backport+config window done
* 13:58 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300733{{!}}stream: webrequest.page_view_stats.dev0 (T428725)]] (duration: 08m 12s)
* 13:57 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5024.*
* 13:55 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp5024.*
* 13:55 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5020.*
* 13:54 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 13:52 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1300733{{!}}stream: webrequest.page_view_stats.dev0 (T428725)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:51 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5004*<nowiki>}</nowiki> and A:liberica
* 13:50 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1300733{{!}}stream: webrequest.page_view_stats.dev0 (T428725)]]
* 13:50 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5004*<nowiki>}</nowiki> and A:liberica
* 13:50 slyngs: reloading liberica config on lvs5004
* 13:50 moritzm: installing openssl security updates
* 13:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5006.eqsin.wmnet with OS bookworm
* 13:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1234: Migration of db1234.eqiad.wmnet completed
* 13:46 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:45 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2202.codfw.wmnet with OS trixie
* 13:43 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298890{{!}}Add 2FA enforcement demotion config for phase 3 groups (T423120)]] (duration: 07m 19s)
* 13:39 alexsanford@deploy1003: alexsanford: Continuing with deployment
* 13:38 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1298890{{!}}Add 2FA enforcement demotion config for phase 3 groups (T423120)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:36 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1298890{{!}}Add 2FA enforcement demotion config for phase 3 groups (T423120)]]
* 13:36 slyngshede@dns1004: END - running authdns-update
* 13:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1234.eqiad.wmnet with OS trixie
* 13:34 moritzm: installing dovecot security updates
* 13:34 slyngshede@dns1004: START - running authdns-update
* 13:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:32 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300787{{!}}hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940)]] (duration: 06m 59s)
* 13:29 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 13:29 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 13:29 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:29 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:28 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:28 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:28 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 13:27 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300787{{!}}hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2202.codfw.wmnet with reason: host reimage
* 13:25 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300787{{!}}hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940)]]
* 13:25 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:24 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300736{{!}}fix: correct intake-url and payload type for NCS experiment events (T422295)]] (duration: 06m 51s)
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
* 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
* 13:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
* 13:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2202.codfw.wmnet with reason: host reimage
* 13:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1300736{{!}}fix: correct intake-url and payload type for NCS experiment events (T422295)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 13:17 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 13:16 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1300736{{!}}fix: correct intake-url and payload type for NCS experiment events (T422295)]]
* 13:15 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
* 13:14 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:13 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300731{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] (duration: 08m 47s)
* 13:13 andrewbogott: sudo -i reprepro --noskipold --component thirdparty/openstack-trixie-flamingo-backports update trixie-wikimedia
* 13:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
* 13:12 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 13:12 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/iOS_FAQ 'Wikimedia Apps/FAQ/iOS' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:12 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 13:12 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 13:11 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 13:11 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:11 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:11 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 13:11 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 13:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 13:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 13:09 gkyziridis@deploy1003: gkyziridis: Continuing with deployment
* 13:06 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1300731{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:06 claime: echo 'https://api.wikimedia.org/service/lw/specs/openapi.yaml' {{!}} mwscript-k8s --attach -- purgeList.php
* 13:04 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1300731{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]]
* 13:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2202.codfw.wmnet with OS trixie
* 13:00 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:57 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1234.eqiad.wmnet with OS trixie
* 12:55 moritzm: installing Exim security updates on Bullseye
* 12:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ganeti5006
* 12:47 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti5006
* 12:46 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti5006
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti5006.eqsin.wmnet 9.0.132.10.in-addr.arpa 9.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 12:46 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti5006.eqsin.wmnet 9.0.132.10.in-addr.arpa 9.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5006 - jmm@cumin2002"
* 12:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5006 - jmm@cumin2002"
* 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1234: Upgrading db1234.eqiad.wmnet
* 12:44 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1234: Upgrading db1234.eqiad.wmnet
* 12:44 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2188: Migration of db2188.codfw.wmnet completed
* 12:29 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "UX improvements - oblivian@cumin1003"
* 12:29 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: UX improvements - oblivian@cumin1003
* 12:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1232: Migration of db1232.eqiad.wmnet completed
* 12:28 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: UX improvements - oblivian@cumin1003
* 12:28 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "UX improvements - oblivian@cumin1003"
* 12:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:26 jmm@cumin2002: START - Cookbook sre.hosts.move-vlan for host ganeti5006
* 12:26 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5006.eqsin.wmnet with OS bookworm
* 12:21 moritzm: remove ganeti5006 from eqsin cluster for reimage [[phab:T428229|T428229]]
* 12:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 12:10 moritzm: installing openjdk-21 security updates on Bookworm
* 12:03 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300764{{!}}Remove GrowthExperiments extension from closed wikis (T428884)]] (duration: 06m 53s)
* 11:59 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 11:58 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1300764{{!}}Remove GrowthExperiments extension from closed wikis (T428884)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:56 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1300764{{!}}Remove GrowthExperiments extension from closed wikis (T428884)]]
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb1012.eqiad.wmnet
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2010.codfw.wmnet
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 11:46 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:46 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2008.codfw.wmnet
* 11:46 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2188: Migration of db2188.codfw.wmnet completed
* 11:44 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 11:43 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:43 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1232: Migration of db1232.eqiad.wmnet completed
* 11:38 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:37 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 11:37 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 11:36 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 11:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2188.codfw.wmnet with OS trixie
* 11:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb1012.eqiad.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2008.codfw.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2010.codfw.wmnet
* 11:33 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 11:32 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 11:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1232.eqiad.wmnet with OS trixie
* 11:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2002.codfw.wmnet
* 11:25 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300749{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300751{{!}}hCaptcha: Enable for DiscussionTools on all wikis (T426039)]] (duration: 08m 38s)
* 11:21 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 11:19 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300749{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300751{{!}}hCaptcha: Enable for DiscussionTools on all wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2188.codfw.wmnet with reason: host reimage
* 11:17 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300749{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300751{{!}}hCaptcha: Enable for DiscussionTools on all wikis (T426039)]]
* 11:15 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2188.codfw.wmnet with reason: host reimage
* 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1232.eqiad.wmnet with reason: host reimage
* 11:13 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2002.codfw.wmnet
* 11:13 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 11:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 11:11 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 11:09 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2001.codfw.wmnet
* 11:09 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1232.eqiad.wmnet with reason: host reimage
* 11:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:04 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2001.codfw.wmnet
* 11:04 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
* 11:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1262.eqiad.wmnet with reason: crash
* 11:00 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 11:00 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
* 10:59 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:59 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 10:58 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2188.codfw.wmnet with OS trixie
* 10:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2188: Upgrading db2188.codfw.wmnet
* 10:52 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2188: Upgrading db2188.codfw.wmnet
* 10:52 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1232.eqiad.wmnet with OS trixie
* 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1232: Upgrading db1232.eqiad.wmnet
* 10:48 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1232: Upgrading db1232.eqiad.wmnet
* 10:48 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:33 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:32 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:31 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300734{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300727{{!}}hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039)]] (duration: 11m 01s)
* 10:26 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 10:23 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:23 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:22 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300734{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300727{{!}}hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:20 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300734{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300727{{!}}hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039)]]
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:10 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:10 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 10:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2045.codfw.wmnet with OS trixie
* 10:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2046', diff saved to https://phabricator.wikimedia.org/P94069 and previous config saved to /var/cache/conftool/dbconfig/20260611-100221-marostegui.json
* 10:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2046', diff saved to https://phabricator.wikimedia.org/P94068 and previous config saved to /var/cache/conftool/dbconfig/20260611-100145-marostegui.json
* 10:01 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:59 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300580{{!}}ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976)]] (duration: 15m 41s)
* 09:54 jiji@deploy1003: jiji: Continuing with deployment
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2045.codfw.wmnet with reason: host reimage
* 09:45 jiji@deploy1003: jiji: Backport for [[gerrit:1300580{{!}}ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:43 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1300580{{!}}ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976)]]
* 09:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2045.codfw.wmnet with reason: host reimage
* 09:37 elukey: uploaded spicerack_12.8.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 09:26 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
* 09:26 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es2045.codfw.wmnet with OS bookworm
* 09:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2176: Migration of db2176.codfw.wmnet completed
* 09:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1219: Migration of db1219.eqiad.wmnet completed
* 09:11 claime: cumin -x 'A:swift-fe' "disable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
* 08:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS bookworm
* 08:39 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2176: Migration of db2176.codfw.wmnet completed
* 08:34 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94055)
* 08:34 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1219: Migration of db1219.eqiad.wmnet completed
* 08:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94053)
* 08:30 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T428823|T428823]] (duration: 01m 18s)
* 08:29 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T428823|T428823]]
* 08:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2176.codfw.wmnet with OS trixie
* 08:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1021: Migration to 10.11.17
* 08:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 08:25 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 08:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc1021: Migration to 10.11.17
* 08:25 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94052)
* 08:24 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): Testing upgrade for [[phab:T428823|T428823]] (duration: 01m 17s)
* 08:23 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): Testing upgrade for [[phab:T428823|T428823]]
* 08:22 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94051)
* 08:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1219.eqiad.wmnet with OS trixie
* 08:17 moritzm: installing PHP 8.2 security updates
* 08:15 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 08:14 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 08:11 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 08:11 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 08:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
* 08:08 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1013.eqiad.wmnet with OS trixie
* 08:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:06 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 08:06 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 08:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2021.codfw.wmnet,pc1021.eqiad.wmnet with reason: upgrade
* 08:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1219.eqiad.wmnet with reason: host reimage
* 08:05 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 08:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 08:04 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
* 08:04 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 08:03 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 08:03 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 07:58 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1219.eqiad.wmnet with reason: host reimage
* 07:56 marostegui: install mariadb 10.11.17 on pc1 [[phab:T427345|T427345]]
* 07:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1013.eqiad.wmnet with reason: host reimage
* 07:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1013.eqiad.wmnet with reason: host reimage
* 07:49 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 07:49 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 07:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 07:47 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 07:47 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 07:46 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2176.codfw.wmnet with OS trixie
* 07:43 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1219.eqiad.wmnet with OS trixie
* 07:43 moritzm: imported Jenkins 2.541.3 for thirdparty/ci (Bullseye) and thirdparty/jenkins (Bookworm, Trixie)
* 07:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 07:35 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1013.eqiad.wmnet with OS trixie
* 07:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2176: Upgrading db2176.codfw.wmnet
* 07:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1219: Upgrading db1219.eqiad.wmnet
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2176: Upgrading db2176.codfw.wmnet
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1219: Upgrading db1219.eqiad.wmnet
* 07:31 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:30 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 07:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1163: Repooling
* 07:19 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 06:51 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
* 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2042', diff saved to https://phabricator.wikimedia.org/P94044 and previous config saved to /var/cache/conftool/dbconfig/20260611-065049-marostegui.json
* 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2042', diff saved to https://phabricator.wikimedia.org/P94043 and previous config saved to /var/cache/conftool/dbconfig/20260611-065027-marostegui.json
* 06:44 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1163: Repooling
* 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1163 [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94041 and previous config saved to /var/cache/conftool/dbconfig/20260611-064319-fceratto.json
* 06:42 fceratto@dns1005: END - running authdns-update
* 06:40 fceratto@dns1005: START - running authdns-update
* 06:33 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:33 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-write for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:33 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1184 to s1 primary and set section read-write [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94040 and previous config saved to /var/cache/conftool/dbconfig/20260611-063323-fceratto.json
* 06:32 fceratto@cumin1003: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94039 and previous config saved to /var/cache/conftool/dbconfig/20260611-063251-fceratto.json
* 06:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:32 fceratto@cumin1003: Dbctl change: Setting sections s1 as read-write for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:32 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-write for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:31 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94037 and previous config saved to /var/cache/conftool/dbconfig/20260611-063100-fceratto.json
* 06:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:30 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-only for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:30 fceratto@cumin1003: Dbctl change: Setting sections s1 as read-only for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:29 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:29 federico3: Starting s1 eqiad failover from db1163 to db1184 - [[phab:T426083|T426083]]
* 06:22 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1184 with weight 0 [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94035 and previous config saved to /var/cache/conftool/dbconfig/20260611-062224-fceratto.json
* 06:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T426083|T426083]]
* 05:37 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 05:28 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 05:27 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 05:18 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 05:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2045: Upgrading es2045.codfw.wmnet
* 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2045: Upgrading es2045.codfw.wmnet
* 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 44s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:23 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2046.*
* 01:19 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 01:18 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 01:18 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1009.eqiad.wmnet with OS trixie
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:11 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:11 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:11 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:10 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:10 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:09 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 01:09 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 01:08 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 01:08 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 01:08 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 01:07 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:07 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 01:06 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 01:06 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 01:06 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 01:05 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 01:05 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 01:05 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 01:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
* 00:58 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
* 00:54 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main1009
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main1009
* 00:41 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main1009
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main1009.eqiad.wmnet 37.48.64.10.in-addr.arpa 7.3.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:41 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main1009.eqiad.wmnet 37.48.64.10.in-addr.arpa 7.3.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1009 - jasmine@cumin2002"
* 00:40 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1009 - jasmine@cumin2002"
* 00:39 cdanis@cumin1003: dbctl commit (dc=all): 'depool db1262', diff saved to https://phabricator.wikimedia.org/P94032 and previous config saved to /var/cache/conftool/dbconfig/20260611-003950-cdanis.json
* 00:36 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 00:34 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5020.*
* 00:30 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main1009
* 00:30 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1009.eqiad.wmnet with OS trixie
* 00:03 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5024.*
== 2026-06-10 ==
* 23:53 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5024.*
* 23:15 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300154{{!}}Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188)]] (duration: 11m 37s)
* 23:11 krinkle@deploy1003: krinkle: Continuing with deployment
* 23:06 krinkle@deploy1003: krinkle: Backport for [[gerrit:1300154{{!}}Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:04 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1300154{{!}}Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188)]]
* 22:57 ladsgroup@dns1004: END - running authdns-update
* 22:55 ladsgroup@dns1004: START - running authdns-update
* 22:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5024.eqsin.wmnet with OS trixie
* 22:13 mutante: gerrit - restarting service for logging change
* 22:11 dzahn@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:10:00 on gerrit.wikimedia.org with reason: service restart
* 22:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on gerrit2003.wikimedia.org with reason: service restart
* 22:06 mutante: gerrit-spare: restarting gerrit
* 22:06 mutante: gerrit-replica: restarting gerrit
* 21:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 21:37 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 21:22 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300250{{!}}ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801)]], [[gerrit:1300248{{!}}tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade]] (duration: 08m 23s)
* 21:17 jforrester@deploy1003: jforrester: Continuing with deployment
* 21:15 jforrester@deploy1003: jforrester: Backport for [[gerrit:1300250{{!}}ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801)]], [[gerrit:1300248{{!}}tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:13 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1300250{{!}}ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801)]], [[gerrit:1300248{{!}}tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade]]
* 21:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5024
* 21:02 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5024
* 21:02 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300247{{!}}Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902)]] (duration: 06m 51s)
* 21:00 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5024
* 21:00 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5024.eqsin.wmnet 35.0.132.10.in-addr.arpa 5.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 21:00 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5024.eqsin.wmnet 35.0.132.10.in-addr.arpa 5.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 21:00 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:00 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5024 - brett@cumin2002"
* 20:59 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5024 - brett@cumin2002"
* 20:57 catrope@deploy1003: catrope: Continuing with deployment
* 20:57 catrope@deploy1003: catrope: Backport for [[gerrit:1300247{{!}}Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:55 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1300247{{!}}Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902)]]
* 20:54 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:50 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5024
* 20:49 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
* 20:48 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5020.*
* 20:44 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300073{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] (duration: 11m 55s)
* 20:40 catrope@deploy1003: catrope, gkyziridis: Continuing with deployment
* 20:34 catrope@deploy1003: catrope, gkyziridis: Backport for [[gerrit:1300073{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:32 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1300073{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]]
* 20:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5020.eqsin.wmnet with OS trixie
* 20:30 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300226{{!}}[arzwiki] Change the wordmark (T427720)]] (duration: 09m 49s)
* 20:25 catrope@deploy1003: gergesshamon, catrope: Continuing with deployment
* 20:22 catrope@deploy1003: gergesshamon, catrope: Backport for [[gerrit:1300226{{!}}[arzwiki] Change the wordmark (T427720)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:20 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1300226{{!}}[arzwiki] Change the wordmark (T427720)]]
* 19:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 19:53 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 19:30 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 19:27 bblack@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=1) rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 19:23 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2046.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:19 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2046.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:19 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5020
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5020
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2044.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:18 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5020
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5020.eqsin.wmnet 24.0.132.10.in-addr.arpa 4.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:18 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5020.eqsin.wmnet 24.0.132.10.in-addr.arpa 4.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:17 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5020 - brett@cumin2002"
* 19:17 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5020 - brett@cumin2002"
* 19:14 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2044.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:11 brett@cumin2002: START - Cookbook sre.dns.netbox
* 19:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 19:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2174: Migration of db2174.codfw.wmnet completed
* 19:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 19:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1218: Migration of db1218.eqiad.wmnet completed
* 18:24 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5020
* 18:23 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS trixie
* 18:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2174: Migration of db2174.codfw.wmnet completed
* 18:20 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 18:17 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1218: Migration of db1218.eqiad.wmnet completed
* 18:16 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5018.*
* 18:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2174.codfw.wmnet with OS trixie
* 18:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1218.eqiad.wmnet with OS trixie
* 17:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
* 17:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1218.eqiad.wmnet with reason: host reimage
* 17:46 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2010.codfw.wmnet with OS trixie
* 17:45 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 17:44 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 17:44 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
* 17:42 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1218.eqiad.wmnet with reason: host reimage
* 17:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94021)
* 17:29 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
* 17:26 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1218.eqiad.wmnet with OS trixie
* 17:26 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2174.codfw.wmnet with OS trixie
* 17:25 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:24 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:24 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1218: Upgrading db1218.eqiad.wmnet
* 17:24 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:24 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1218: Upgrading db1218.eqiad.wmnet
* 17:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 17:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2174: Upgrading db2174.codfw.wmnet
* 17:23 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:23 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
* 17:23 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:22 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2174: Upgrading db2174.codfw.wmnet
* 17:22 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:22 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 17:22 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:22 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 17:22 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-text and not P<nowiki>{</nowiki>cp7008*<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 17:21 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 17:21 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:19 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 17:19 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:18 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:18 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:17 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:17 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:17 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:13 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:12 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-ntp (exit_code=0) rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 17:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 17:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1206: Migration of db1206.eqiad.wmnet completed
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2010
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2010
* 17:02 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2010
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2010.codfw.wmnet 35.48.192.10.in-addr.arpa 5.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:02 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2010.codfw.wmnet 35.48.192.10.in-addr.arpa 5.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2010 - jasmine@cumin2002"
* 17:01 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2010 - jasmine@cumin2002"
* 16:57 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 16:50 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2010
* 16:50 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS trixie
* 16:41 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 16:39 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 16:39 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 16:34 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 16:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5018.eqsin.wmnet with OS trixie
* 16:22 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 16:20 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 16:17 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1206: Migration of db1206.eqiad.wmnet completed
* 16:15 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 16:15 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 16:14 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 16:12 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 16:12 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:11 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 16:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1206.eqiad.wmnet with OS trixie
* 16:01 blblack: apt: uploaded libvmod-wmfuniq 0.3.0 for trixie
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 15:53 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:52 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:51 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 15:50 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1206.eqiad.wmnet with reason: host reimage
* 15:45 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1206.eqiad.wmnet with reason: host reimage
* 15:43 sukhe@cumin1003: END (FAIL) - Cookbook sre.dns.admin (exit_code=99) DNS admin: depool drmrs [reason: no reason specified, no task ID specified]
* 15:42 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool drmrs [reason: no reason specified, no task ID specified]
* 15:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2173: Migration of db2173.codfw.wmnet completed
* 15:34 topranks: drain traffic through cr2-drmrs to reset pic0
* 15:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94013)
* 15:30 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1206.eqiad.wmnet with OS trixie
* 15:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1206: Upgrading db1206.eqiad.wmnet
* 15:28 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1206: Upgrading db1206.eqiad.wmnet
* 15:27 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:25 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:24 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1009
* 15:24 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Harroyo-wmf out of all services on: 2436 hosts
* 15:23 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1009
* 15:21 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:20 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release
* 15:19 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5018
* 15:19 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5018
* 15:18 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:18 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5018
* 15:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5018.eqsin.wmnet 18.0.132.10.in-addr.arpa 8.1.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 15:18 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5018.eqsin.wmnet 18.0.132.10.in-addr.arpa 8.1.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 15:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:15 brett@cumin2002: START - Cookbook sre.dns.netbox
* 15:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1195: Migration of db1195.eqiad.wmnet completed
* 15:12 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:11 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:11 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin1003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:11 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin1003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:08 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300169{{!}}Fix snak value display for rtl languages (T360854)]], [[gerrit:1300168{{!}}Fix snak value display for rtl languages (T360854)]] (duration: 08m 39s)
* 15:03 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 15:01 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1300169{{!}}Fix snak value display for rtl languages (T360854)]], [[gerrit:1300168{{!}}Fix snak value display for rtl languages (T360854)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:59 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1300169{{!}}Fix snak value display for rtl languages (T360854)]], [[gerrit:1300168{{!}}Fix snak value display for rtl languages (T360854)]]
* 14:58 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:55 Lucas_WMDE: lucaswerkmeister-wmde@deploy1003 $ printf 'https://www.mediawiki.org/keys/%s\n' '' 'keys.txt' 'keys.html' {{!}} mwscript-k8s --attach --comment=[[phab:T423267|T423267]] purgeList mediawikiwiki
* 14:54 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release, now with correct schema
* 14:53 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2173: Migration of db2173.codfw.wmnet completed
* 14:50 ayounsi@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin2003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:50 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:48 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:47 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299614{{!}}Add my public key to mediawiki.org/keys (T423267)]] (duration: 08m 33s)
* 14:46 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:42 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, matmarex: Continuing with deployment
* 14:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2173.codfw.wmnet with OS trixie
* 14:40 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, matmarex: Backport for [[gerrit:1299614{{!}}Add my public key to mediawiki.org/keys (T423267)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:40 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:40 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:38 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1299614{{!}}Add my public key to mediawiki.org/keys (T423267)]]
* 14:38 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 14:34 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:34 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:33 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 14:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1195: Migration of db1195.eqiad.wmnet completed
* 14:28 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 14:26 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 14:26 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 14:24 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release, now with dblist translate
* 14:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2173.codfw.wmnet with reason: host reimage
* 14:23 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 14:22 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 14:22 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 14:21 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 14:20 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart (exit_code=0) rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 14:20 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2173.codfw.wmnet with reason: host reimage
* 14:20 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1195.eqiad.wmnet with OS trixie
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 14:16 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 14:15 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 14:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 14:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 14:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 14:13 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:13 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 14:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 14:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 14:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 14:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 14:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 14:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 14:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2173.codfw.wmnet with OS trixie
* 14:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 14:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1195.eqiad.wmnet with reason: host reimage
* 14:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 13:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2173: Upgrading db2173.codfw.wmnet
* 13:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2173: Upgrading db2173.codfw.wmnet
* 13:58 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:58 atsuko@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/ttmserver-export.php --wiki=default --ttmserver eqiad-test # [[phab:T425377|T425377]] populating production index on test cluster to estimate time required for the release
* 13:56 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1195.eqiad.wmnet with reason: host reimage
* 13:54 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Atieno out of all services on: 2436 hosts
* 13:42 Lucas_WMDE: UTC afternoon backport+config window done
* 13:42 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1195.eqiad.wmnet with OS trixie
* 13:36 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297237{{!}}wmf-config: Update private subnets to include additions (T427393)]] (duration: 07m 20s)
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1195: Upgrading db1195.eqiad.wmnet
* 13:33 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling restart_daemons on A:hcaptcha-proxy and A:hcaptcha-proxy
* 13:33 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling restart_daemons on A:durum and A:durum
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2170: Migration of db2170.codfw.wmnet completed
* 13:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1195: Upgrading db1195.eqiad.wmnet
* 13:32 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:32 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, brett: Continuing with deployment
* 13:32 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough
* 13:31 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 13:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, brett: Backport for [[gerrit:1297237{{!}}wmf-config: Update private subnets to include additions (T427393)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:31 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 13:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1297237{{!}}wmf-config: Update private subnets to include additions (T427393)]]
* 13:28 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5018.eqsin.wmnet with reason: host down
* 13:28 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling restart_daemons on A:tcpproxy and A:tcpproxy
* 13:25 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5018.eqsin.wmnet,service=(cdn{{!}}ats-be)
* 13:22 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 13:20 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling restart_daemons on A:durum and A:durum
* 13:20 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling restart_daemons on A:hcaptcha-proxy and A:hcaptcha-proxy
* 13:19 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299676{{!}}Enable ULS v2 on group0 wikis]] (duration: 17m 00s)
* 13:19 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough
* 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1186: Migration of db1186.eqiad.wmnet completed
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:15 sbisson@deploy1003: sbisson, abi: Continuing with deployment
* 13:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy rolling restart_daemons on A:tcpproxy and A:tcpproxy
* 13:05 sbisson@deploy1003: sbisson, abi: Backport for [[gerrit:1299676{{!}}Enable ULS v2 on group0 wikis]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1014.eqiad.wmnet with OS trixie
* 13:02 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1299676{{!}}Enable ULS v2 on group0 wikis]]
* 12:47 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2170: Migration of db2170.codfw.wmnet completed
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5004.eqsin.wmnet with OS bookworm
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1014.eqiad.wmnet with reason: host reimage
* 12:42 topranks: re-map DSCP AF41 from 'low' to 'normal' priority qos class on network [[phab:T424640|T424640]]
* 12:41 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1014.eqiad.wmnet with reason: host reimage
* 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2170.codfw.wmnet with OS trixie
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1186: Migration of db1186.eqiad.wmnet completed
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1014
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host rdb1014
* 12:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1186.eqiad.wmnet with OS trixie
* 12:21 jiji@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host rdb1014
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) rdb1014.eqiad.wmnet 42.48.64.10.in-addr.arpa 2.4.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:21 jiji@cumin1003: START - Cookbook sre.dns.wipe-cache rdb1014.eqiad.wmnet 42.48.64.10.in-addr.arpa 2.4.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host rdb1014 - jiji@cumin1003"
* 12:21 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host rdb1014 - jiji@cumin1003"
* 12:20 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2170.codfw.wmnet with reason: host reimage
* 12:16 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 12:13 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1014
* 12:12 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1014.eqiad.wmnet with OS trixie
* 12:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2170.codfw.wmnet with reason: host reimage
* 12:08 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300104{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1300102{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1299643{{!}}wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792)]] (duration: 11m 06s)
* 12:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1186.eqiad.wmnet with reason: host reimage
* 12:03 reedy@deploy1003: reedy: Continuing with deployment
* 12:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1186.eqiad.wmnet with reason: host reimage
* 11:59 reedy@deploy1003: reedy: Backport for [[gerrit:1300104{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1300102{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1299643{{!}}wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes c
* 11:57 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1300104{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1300102{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1299643{{!}}wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792)]]
* 11:53 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2170.codfw.wmnet with OS trixie
* 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ganeti5004
* 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti5004
* 11:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2170: Upgrading db2170.codfw.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2170: Upgrading db2170.codfw.wmnet
* 11:49 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti5004
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti5004.eqsin.wmnet 40.0.132.10.in-addr.arpa 0.4.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 11:49 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti5004.eqsin.wmnet 40.0.132.10.in-addr.arpa 0.4.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5004 - jmm@cumin2002"
* 11:49 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5004 - jmm@cumin2002"
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:48 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1186.eqiad.wmnet with OS trixie
* 11:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1186: Upgrading db1186.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1186: Upgrading db1186.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:38 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:35 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 11:34 jmm@cumin2002: START - Cookbook sre.hosts.move-vlan for host ganeti5004
* 11:34 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 11:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5004.eqsin.wmnet with OS bookworm
* 11:34 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 11:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1151: Security updates
* 11:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:33 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:33 root@cumin1003: START - Cookbook sre.mysql.pool pool db1151: Security updates
* 11:31 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:30 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:30 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:30 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:27 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:27 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:23 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:23 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:23 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 11:23 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 11:16 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 11:15 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 11:15 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:15 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:09 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1151: Security updates
* 11:09 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:09 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:09 root@cumin1003: START - Cookbook sre.mysql.depool depool db1151: Security updates
* 11:08 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300092{{!}}ProductionServices: re-add poolcounter2006 (T426736)]] (duration: 06m 55s)
* 11:04 blake@deploy1003: blake: Continuing with deployment
* 11:04 blake@deploy1003: blake: Backport for [[gerrit:1300092{{!}}ProductionServices: re-add poolcounter2006 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:03 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:01 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300092{{!}}ProductionServices: re-add poolcounter2006 (T426736)]]
* 10:59 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2006.codfw.wmnet
* 10:57 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:57 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:56 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:56 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:56 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:56 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2006.codfw.wmnet
* 10:56 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300087{{!}}ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736)]] (duration: 06m 42s)
* 10:51 blake@deploy1003: blake: Continuing with deployment
* 10:51 moritzm: remove ganeti5004 from eqsin cluster for reimage [[phab:T428229|T428229]]
* 10:51 blake@deploy1003: blake: Backport for [[gerrit:1300087{{!}}ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:49 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300087{{!}}ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736)]]
* 10:47 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2005.codfw.wmnet
* 10:47 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:46 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:46 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:45 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:43 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2005.codfw.wmnet
* 10:43 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300082{{!}}ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736)]] (duration: 07m 38s)
* 10:41 moritzm: installing nginx security updates
* 10:38 blake@deploy1003: blake: Continuing with deployment
* 10:38 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1152: Security updates
* 10:38 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:38 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:38 root@cumin1003: START - Cookbook sre.mysql.pool pool db1152: Security updates
* 10:38 blake@deploy1003: blake: Backport for [[gerrit:1300082{{!}}ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:37 moritzm: fail over Ganeti master in eqsin to ganeti5007 [[phab:T428229|T428229]]
* 10:35 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300082{{!}}ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736)]]
* 10:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1007.eqiad.wmnet
* 10:29 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1007.eqiad.wmnet
* 10:29 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300072{{!}}ProductionServices: reboot poolcounter1007 (T426736)]] (duration: 07m 45s)
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 10:27 jmm@cumin2002: DONE (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for sretest2009.codfw.wmnet: Renew puppet certificate - jmm@cumin2002
* 10:24 blake@deploy1003: blake: Continuing with deployment
* 10:23 blake@deploy1003: blake: Backport for [[gerrit:1300072{{!}}ProductionServices: reboot poolcounter1007 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:21 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:21 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300072{{!}}ProductionServices: reboot poolcounter1007 (T426736)]]
* 10:21 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:21 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:21 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:20 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1006.eqiad.wmnet
* 10:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Security updates
* 10:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:14 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:14 root@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Security updates
* 10:13 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1006.eqiad.wmnet
* 10:12 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300064{{!}}ProductionServices: reboot poolcounter1006.eqiad (T426736)]] (duration: 07m 46s)
* 10:07 blake@deploy1003: blake: Continuing with deployment
* 10:06 blake@deploy1003: blake: Backport for [[gerrit:1300064{{!}}ProductionServices: reboot poolcounter1006.eqiad (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:04 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300064{{!}}ProductionServices: reboot poolcounter1006.eqiad (T426736)]]
* 09:57 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300058{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]], [[gerrit:1300059{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]] (duration: 09m 32s)
* 09:52 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1300058{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]], [[gerrit:1300059{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1300058{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]], [[gerrit:1300059{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]]
* 09:35 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 09:34 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 09:32 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 09:32 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 09:26 moritzm: upgrade routinator in eqiad to 0.15.2 [[phab:T428456|T428456]]
* 09:23 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:23 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 09:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus5003.eqsin.wmnet to plain
* 09:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to plain
* 09:15 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:29 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:29 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:01 fceratto@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host db1215.eqiad.wmnet with OS trixie
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 07:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
* 07:44 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1215.eqiad.wmnet with reason: host reimage
* 07:41 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 07:40 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
* 07:40 moritzm: installing openssl security updates
* 07:39 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1215.eqiad.wmnet with reason: host reimage
* 07:38 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 07:37 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 07:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:29 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299556{{!}}ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561{{!}}ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529{{!}}translate: adding separate read/write endpoints (T425377)]] (duration: 14m 03s)
* 07:25 atsuko@deploy1003: atsuko: Continuing with deployment
* 07:23 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1215.eqiad.wmnet with OS trixie
* 07:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1215.eqiad.wmnet with reason: Reimage
* 07:21 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:20 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:20 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:17 atsuko@deploy1003: atsuko: Backport for [[gerrit:1299556{{!}}ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561{{!}}ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529{{!}}translate: adding separate read/write endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:15 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1299556{{!}}ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561{{!}}ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529{{!}}translate: adding separate read/write endpoints (T425377)]]
* 07:14 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:12 atsukoito: backporting extensions/Translate to wmf/1.47.0-wmf.5 and applying the config
* 07:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 06:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 06:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 05:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 05:43 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 05:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 05:41 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 47s)
* 02:07 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1008.eqiad.wmnet with OS trixie
* 02:03 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 02:02 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:52 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:51 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:51 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:50 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:50 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:49 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1008.eqiad.wmnet with reason: host reimage
* 01:49 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:49 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:49 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:49 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:48 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 01:48 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 01:47 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 01:47 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 01:46 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 01:46 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 01:44 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 01:44 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 01:43 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 01:43 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1008.eqiad.wmnet with reason: host reimage
* 01:25 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main1008
* 01:24 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main1008
* 01:24 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main1008
* 01:24 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main1008.eqiad.wmnet 45.32.64.10.in-addr.arpa 5.4.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 01:23 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main1008.eqiad.wmnet 45.32.64.10.in-addr.arpa 5.4.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 01:23 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:23 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1008 - jasmine@cumin2002"
* 01:23 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1008 - jasmine@cumin2002"
* 01:19 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 01:12 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main1008
* 01:11 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1008.eqiad.wmnet with OS trixie
* 01:00 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2009.codfw.wmnet with OS trixie
* 00:54 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 00:53 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 00:43 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
* 00:40 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:38 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
* 00:38 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:38 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:37 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:37 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:36 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 00:36 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:34 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 00:34 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 00:33 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 00:33 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 00:32 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 00:32 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 00:32 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2009
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2009
* 00:15 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2009
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2009.codfw.wmnet 33.48.192.10.in-addr.arpa 3.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:15 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2009.codfw.wmnet 33.48.192.10.in-addr.arpa 3.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2009 - jasmine@cumin2002"
* 00:15 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2009 - jasmine@cumin2002"
* 00:10 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 00:03 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2009
* 00:03 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS trixie
== 2026-06-09 ==
* 22:50 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299640{{!}}HandleSectionLinks: add temporary fallback to identify html headings (T428677)]] (duration: 08m 59s)
* 22:45 cscott@deploy1003: cscott: Continuing with deployment
* 22:43 cscott@deploy1003: cscott: Backport for [[gerrit:1299640{{!}}HandleSectionLinks: add temporary fallback to identify html headings (T428677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:41 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1299640{{!}}HandleSectionLinks: add temporary fallback to identify html headings (T428677)]]
* 22:15 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299639{{!}}[Bug] Donor Badge: Remove client prefs for control group (T428501)]] (duration: 20m 57s)
* 22:11 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:07 mutante: gerrit - apache httpd log file location moved to /srv/gerrit/site_path/review_site/logs/ [[phab:T425667|T425667]]
* 22:06 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on gerrit2003.wikimedia.org with reason: debug
* 21:56 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1299639{{!}}[Bug] Donor Badge: Remove client prefs for control group (T428501)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1299639{{!}}[Bug] Donor Badge: Remove client prefs for control group (T428501)]]
* 21:52 ryankemper: [[phab:T428241|T428241]] removed retired wdqs2009 full-graph journal dump (446G x2, ~892G) from clouddumps100[1-2]:/srv/dumps/xmldatadumps/public/other/wdqs
* 21:49 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299602{{!}}Revert "Create VectorComponentPageToolbar component" (T428649)]] (duration: 08m 16s)
* 21:48 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
* 21:45 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 21:43 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1299602{{!}}Revert "Create VectorComponentPageToolbar component" (T428649)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:41 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1299602{{!}}Revert "Create VectorComponentPageToolbar component" (T428649)]]
* 21:34 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit1003.wikimedia.org with reason: debug
* 21:27 maryum: Deployed security fix for [[phab:T428324|T428324]]
* 21:24 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
* 21:15 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
* 21:06 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
* 20:50 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS trixie
* 20:50 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299588{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270)]], [[gerrit:1299589{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T428270)]] (duration: 11m 13s)
* 20:46 cscott@deploy1003: cscott: Continuing with deployment
* 20:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS trixie
* 20:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:42 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:41 cscott@deploy1003: cscott: Backport for [[gerrit:1299588{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270)]], [[gerrit:1299589{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T428270)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:39 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1299588{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270)]], [[gerrit:1299589{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T428270)]]
* 20:38 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:33 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:32 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299454{{!}}wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902)]] (duration: 22m 08s)
* 20:28 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:28 cscott@deploy1003: cscott, gkyziridis: Continuing with deployment
* 20:24 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2004
* 20:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2004
* 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2003
* 20:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2003
* 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2002
* 20:13 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2002
* 20:13 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2001
* 20:13 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2001
* 20:12 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:12 cscott@deploy1003: cscott, gkyziridis: Backport for [[gerrit:1299454{{!}}wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1299454{{!}}wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902)]]
* 20:09 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:04 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:59 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:53 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:48 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:47 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:47 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:28 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wdqs1015.eqiad.wmnet
* 19:28 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:28 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wdqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
* 19:27 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wdqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
* 19:20 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
* 19:15 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2008.codfw.wmnet with OS trixie
* 19:15 ryankemper@cumin2002: START - Cookbook sre.hosts.decommission for hosts wdqs1015.eqiad.wmnet
* 19:12 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 19:12 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 19:00 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:58 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 18:58 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
* 18:58 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 18:58 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:57 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:57 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 18:56 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 18:56 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:55 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:55 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 18:55 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 18:54 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 18:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:54 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 18:53 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 18:53 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 18:53 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2003 to codfw - jhancock@cumin2002"
* 18:52 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2003 to codfw - jhancock@cumin2002"
* 18:52 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 18:52 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 18:51 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
* 18:51 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 18:51 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 18:51 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 18:50 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 18:50 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 18:47 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:47 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:47 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:42 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:42 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:31 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 18:29 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS trixie
* 18:26 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2008.codfw.wmnet with OS trixie
* 17:48 mutante: https://releases.wikimedia.org {{!}} https://releases-jenkins.wikimedia.org - down for maintenance [[phab:T418299|T418299]]
* 17:48 cmooney@dns2005: END - running authdns-update
* 17:47 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: reimage
* 17:47 cmooney@dns2005: START - running authdns-update
* 17:46 sukhe: sudo cumin 'A:hcaptcha-proxy' 'run-puppet-agent': rolling out CR {{Gerrit|1299427}} [[phab:T428539|T428539]]
* 17:43 jayme: kafka-main2008 is down due to hardware failure [[phab:T428654|T428654]]
* 17:32 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1002.eqiad.wmnet with OS trixie
* 17:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
* 17:06 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
* 17:05 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2008
* 17:05 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2008
* 17:04 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:04 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2008
* 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2008.codfw.wmnet 4.32.192.10.in-addr.arpa 4.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:04 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:04 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2008.codfw.wmnet 4.32.192.10.in-addr.arpa 4.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2008 - jasmine@cumin2002"
* 17:04 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5018
* 17:04 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2008 - jasmine@cumin2002"
* 17:03 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5018.eqsin.wmnet with OS trixie
* 16:58 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:58 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:57 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 16:57 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:57 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:50 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf1002.eqiad.wmnet with OS trixie
* 16:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:47 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1001.eqiad.wmnet with OS trixie
* 16:47 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 16:47 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/redioscope: apply
* 16:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:41 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 16:41 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 16:35 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2008
* 16:34 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS trixie
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:31 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
* 16:30 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
* 16:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
* 16:29 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:26 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
* 16:23 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
* 16:22 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: apply
* 16:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:16 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:12 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf1001.eqiad.wmnet with OS trixie
* 16:10 jiji@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'sync'.
* 16:09 jiji@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'sync'.
* 16:07 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf2002.codfw.wmnet with OS trixie
* 16:02 jiji@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 16:02 jiji@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 16:00 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'.
* 15:59 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
* 15:59 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'.
* 15:59 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 15:59 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'.
* 15:59 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/termbox: apply
* 15:58 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/termbox: apply
* 15:58 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/termbox: apply
* 15:57 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 15:57 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 15:57 lucaswerkmeister-wmde@deploy1003: helmfile [staging] DONE helmfile.d/services/termbox: apply
* 15:56 lucaswerkmeister-wmde@deploy1003: helmfile [staging] START helmfile.d/services/termbox: apply
* 15:54 jiji@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:53 jiji@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 15:51 jiji@deploy1003: Finished scap sync-world: redeploy {{Gerrit|1299468}} (duration: 07m 23s)
* 15:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf2002.codfw.wmnet with reason: host reimage
* 15:47 jiji@deploy1003: jiji: Continuing with deployment
* 15:46 jiji@deploy1003: jiji: redeploy {{Gerrit|1299468}} synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:46 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf2002.codfw.wmnet with reason: host reimage
* 15:45 jiji@deploy1003: Started scap sync-world: redeploy {{Gerrit|1299468}}
* 15:43 brouberol@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-eqiad
* 15:34 brennen@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab1004 for [[phab:T410849|T410849]] (followup for robots.txt) (duration: 00m 40s)
* 15:33 brennen@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab1004 for [[phab:T410849|T410849]] (followup for robots.txt)
* 15:33 brennen@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab2002 for [[phab:T410849|T410849]] (followup for robots.txt) (duration: 00m 45s)
* 15:32 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299468{{!}}ProductionServices.php: switch filebackend.php to rdb2015:6381 #2 (T418918 T291916)]] (duration: 07m 21s)
* 15:32 brennen@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab2002 for [[phab:T410849|T410849]] (followup for robots.txt)
* 15:28 jiji@deploy1003: Rolling back deployment
* 15:27 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf2002.codfw.wmnet with OS trixie
* 15:27 jiji@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 15:26 jiji@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 15:25 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1299468{{!}}ProductionServices.php: switch filebackend.php to rdb2015:6381 #2 (T418918 T291916)]]
* 15:22 urbanecm: Remove `migrateMentorStatusAwayToCommunityConfiguration` from updatelog on all wikis ([[phab:T409170|T409170]]; the script was only ever run as a dry-run)
* 15:21 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 15:21 jiji@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 15:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf2001.codfw.wmnet with OS trixie
* 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@d244a3e]: deploy phab1004 for [[phab:T410849|T410849]] (duration: 00m 42s)
* 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@d244a3e]: deploy phab1004 for [[phab:T410849|T410849]]
* 15:02 brennen@deploy1003: Finished deploy [phabricator/deployment@d244a3e]: deploy phab2002 for [[phab:T410849|T410849]] (duration: 00m 45s)
* 15:01 brennen@deploy1003: Started deploy [phabricator/deployment@d244a3e]: deploy phab2002 for [[phab:T410849|T410849]]
* 14:58 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf2001.codfw.wmnet with reason: host reimage
* 14:52 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf2001.codfw.wmnet with reason: host reimage
* 14:52 arnaudb@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab[2002-2003].codfw.wmnet,phab[1004-1006].eqiad.wmnet with reason: [[phab:T410849|T410849]]
* 14:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 14:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 14:40 moritzm: upgrade routinator in codfw to 0.15.2 [[phab:T428456|T428456]]
* 14:35 brouberol@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 14:33 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf2001.codfw.wmnet with OS trixie
* 14:26 brouberol@cumin1003: END (ERROR) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=97) rolling reboot on A:cephosd-eqiad
* 14:26 brouberol@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 14:20 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-codfw
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host parsoidtest1001.eqiad.wmnet
* 14:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2153: Migration of db2153.codfw.wmnet completed
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of rpki2003.codfw.wmnet to drbd
* 14:14 moritzm: imported routinator 0.15.2-1bookworm to thirdparty/routinator for bookworm-wikimedia [[phab:T428456|T428456]]
* 14:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1184: Migration of db1184.eqiad.wmnet completed
* 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host parsoidtest1001.eqiad.wmnet
* 14:07 Dreamy_Jazz: Afternoon UTC backport window done
* 14:07 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:06 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299495{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]], [[gerrit:1299502{{!}}SecurePollLogPager: Cast user IDs to ints before use (T428599)]] (duration: 06m 53s)
* 14:06 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: rack depool
* 14:03 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of rpki2003.codfw.wmnet to drbd
* 14:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow2004.codfw.wmnet to drbd
* 14:02 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:02 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1299495{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]], [[gerrit:1299502{{!}}SecurePollLogPager: Cast user IDs to ints before use (T428599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:59 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1299495{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]], [[gerrit:1299502{{!}}SecurePollLogPager: Cast user IDs to ints before use (T428599)]]
* 13:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:56 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:56 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:56 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 13:56 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 13:55 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:55 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* {{safesubst:SAL entry|1=13:55 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298929{{!}}Simplify fragment processing (T423700)]], [[gerrit:1298926{{!}}Move ::getFragmentsToTransform() to Content<nowiki>{</nowiki>Text,DOM<nowiki>}</nowiki>TransformStage]], [[gerrit:1298927{{!}}OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages]], [[gerrit:1298925{{!}}Reset DeduplicateStyles state between different pipeline executions (T428336 T428215)]], [[gerrit:1299497}}
* 13:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:51 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow2004.codfw.wmnet to drbd
* 13:50 cscott@deploy1003: cscott: Continuing with deployment
* 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2045.codfw.wmnet to cluster codfw and group A
* 13:48 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2045.codfw.wmnet to cluster codfw and group A
* 13:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2027.codfw.wmnet to cluster codfw and group A
* 13:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2027.codfw.wmnet to cluster codfw and group A
* 13:46 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 13:45 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 13:44 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* {{safesubst:SAL entry|1=13:42 cscott@deploy1003: cscott: Backport for [[gerrit:1298929{{!}}Simplify fragment processing (T423700)]], [[gerrit:1298926{{!}}Move ::getFragmentsToTransform() to Content<nowiki>{</nowiki>Text,DOM<nowiki>}</nowiki>TransformStage]], [[gerrit:1298927{{!}}OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages]], [[gerrit:1298925{{!}}Reset DeduplicateStyles state between different pipeline executions (T428336 T428215)]], [[gerrit:1299497{{!}}Store indicators}}
* 13:41 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* {{safesubst:SAL entry|1=13:40 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1298929{{!}}Simplify fragment processing (T423700)]], [[gerrit:1298926{{!}}Move ::getFragmentsToTransform() to Content<nowiki>{</nowiki>Text,DOM<nowiki>}</nowiki>TransformStage]], [[gerrit:1298927{{!}}OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages]], [[gerrit:1298925{{!}}Reset DeduplicateStyles state between different pipeline executions (T428336 T428215)]], [[gerrit:1299497{{!}}}}
* 13:40 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw
* 13:39 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 13:37 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 13:35 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 13:33 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 13:32 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 13:32 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298834{{!}}config: Disable EmailConfirmationBanner on all wikis (T428291)]] (duration: 07m 01s)
* 13:30 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2153: Migration of db2153.codfw.wmnet completed
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 lucaswerkmeister-wmde@deploy1003: mmartorana, lucaswerkmeister-wmde: Continuing with deployment
* 13:27 lucaswerkmeister-wmde@deploy1003: mmartorana, lucaswerkmeister-wmde: Backport for [[gerrit:1298834{{!}}config: Disable EmailConfirmationBanner on all wikis (T428291)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:26 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1184: Migration of db1184.eqiad.wmnet completed
* 13:25 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298834{{!}}config: Disable EmailConfirmationBanner on all wikis (T428291)]]
* 13:25 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:24 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:21 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 13:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2153.codfw.wmnet with OS trixie
* 13:20 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2241: rack depool
* 13:20 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1237: repool after maintenance db1237
* 13:19 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298654{{!}}Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206)]] (duration: 09m 40s)
* 13:17 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2006.codfw.wmnet
* 13:17 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2006.codfw.wmnet
* 13:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2251-2253].codfw.wmnet
* 13:16 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2251-2253].codfw.wmnet
* 13:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2005.codfw.wmnet
* 13:16 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2005.codfw.wmnet
* 13:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1184.eqiad.wmnet with OS trixie
* 13:14 lucaswerkmeister-wmde@deploy1003: neriah, lucaswerkmeister-wmde: Continuing with deployment
* 13:11 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
* 13:11 lucaswerkmeister-wmde@deploy1003: neriah, lucaswerkmeister-wmde: Backport for [[gerrit:1298654{{!}}Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:09 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298654{{!}}Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206)]]
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2153.codfw.wmnet with reason: host reimage
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1015.eqiad.wmnet with OS trixie
* 12:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1184.eqiad.wmnet with reason: host reimage
* 12:58 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2153.codfw.wmnet with reason: host reimage
* 12:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1016.eqiad.wmnet with OS trixie
* 12:57 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
* 12:56 XioNoX: lsw1-a4-codfw> request system reboot - [[phab:T427357|T427357]]
* 12:55 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1184.eqiad.wmnet with reason: host reimage
* 12:50 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299477{{!}}hCaptcha: Roll out to all wikis for api account creation. (T426050)]] (duration: 07m 21s)
* 12:46 kharlan@deploy1003: kharlan, dbrant: Continuing with deployment
* 12:46 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
* 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 12:45 kharlan@deploy1003: kharlan, dbrant: Backport for [[gerrit:1299477{{!}}hCaptcha: Roll out to all wikis for api account creation. (T426050)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:45 topranks: shut sub-interfaces for row A/B legacy vlans on cr1-codfw [[phab:T427357|T427357]]
* 12:45 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
* 12:43 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1299477{{!}}hCaptcha: Roll out to all wikis for api account creation. (T426050)]]
* 12:42 topranks: increase OSPF cost on ssw1-a1-codfw link to lsw1-a4-codfw to force traffic via alternate spine [[phab:T427357|T427357]]
* 12:41 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299478{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]] (duration: 07m 02s)
* 12:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 12:40 moritzm: installing wireshark security updates
* 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2153.codfw.wmnet with OS trixie
* 12:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1184.eqiad.wmnet with OS trixie
* 12:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:36 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1299478{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2153: Upgrading db2153.codfw.wmnet
* 12:34 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1237: repool after maintenance db1237
* 12:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1299478{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]]
* 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2153: Upgrading db2153.codfw.wmnet
* 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1184: Upgrading db1184.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1184: Upgrading db1184.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1237.eqiad.wmnet with OS trixie
* 12:32 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 12:32 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 12:29 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:29 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:27 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2005.codfw.wmnet
* 12:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2046: repool after maintenance
* 12:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2006.codfw.wmnet
* 12:23 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298829{{!}}wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126)]] (duration: 16m 04s)
* 12:23 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2006.codfw.wmnet
* 12:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2251-2253].codfw.wmnet
* 12:22 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2005.codfw.wmnet
* 12:20 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2251-2253].codfw.wmnet
* 12:20 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:20 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: rack depool
* 12:20 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:20 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2241: rack depool
* 12:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1016
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1016
* 12:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1015
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1015
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1016.eqiad.wmnet with OS trixie
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1015.eqiad.wmnet with OS trixie
* 12:17 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
* 12:17 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 24 hosts with reason: Rack A4 depool
* 12:16 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Continuing with deployment
* 12:15 topranks: drain traffic on ssw1-a1-codfw - add gshut community in evpn underlay - [[phab:T427357|T427357]]
* 12:14 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
* 12:13 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Backport for [[gerrit:1298829{{!}}wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1237.eqiad.wmnet with reason: host reimage
* 12:07 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1298829{{!}}wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126)]]
* 12:05 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1237.eqiad.wmnet with reason: host reimage
* 12:00 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Dmaza out of all services on: 2435 hosts
* 11:51 atsuko@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 11:51 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
* 11:49 atsuko@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 11:48 atsuko@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 11:47 atsuko@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 11:45 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:44 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:43 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:43 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2046: repool after maintenance
* 11:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:36 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2046.codfw.wmnet with OS trixie
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2185.codfw.wmnet with reason: Reimage
* 11:31 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging HMonroy out of all services on: 2435 hosts
* 11:28 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging KSiebert out of all services on: 2435 hosts
* 11:26 slyngs: CAS-SSO upgrade to version 7.3.7.2
* 11:26 slyngshede@dns1004: END - running authdns-update
* 11:24 slyngshede@dns1004: START - running authdns-update
* 11:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2046.codfw.wmnet with reason: host reimage
* 11:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1043: repool after upgrade
* 11:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2046.codfw.wmnet with reason: host reimage
* 10:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2046.codfw.wmnet with OS trixie
* 10:53 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2046: Upgrading es2046.codfw.wmnet
* 10:53 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2046: Upgrading es2046.codfw.wmnet
* 10:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 10:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 10:52 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 10:52 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:52 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 10:51 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:32 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1043: repool after upgrade
* 10:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:28 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1160: Repooling
* 10:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1043.eqiad.wmnet with OS trixie
* 10:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:17 elukey: complete rollout of apache2 upgrades
* 10:16 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:15 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:13 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:12 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:12 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1043.eqiad.wmnet with reason: host reimage
* 10:04 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1043.eqiad.wmnet with reason: host reimage
* 10:04 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:04 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:04 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:57 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
* 09:51 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:51 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:50 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:50 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS trixie
* 09:48 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1043: Upgrading es1043.eqiad.wmnet
* 09:48 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:47 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:45 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 09:41 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 09:36 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=5 --verbose --last-checked="20260603"` (after stopping previous scan run)
* 09:34 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=5 --verbose` (after stopping previous scan run)
* 09:27 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 09:26 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 09:17 fceratto@cumin1003: MariaDB change: Setting sections s5 as read-write
* 09:17 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1043: Upgrading es1043.eqiad.wmnet
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:12 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1042 to es4 eqiad primary [[phab:T428386|T428386]]', diff saved to https://phabricator.wikimedia.org/P93943 and previous config saved to /var/cache/conftool/dbconfig/20260609-091215-marostegui.json
* 09:11 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1043 to es4 eqiad primary [[phab:T428386|T428386]]', diff saved to https://phabricator.wikimedia.org/P93942 and previous config saved to /var/cache/conftool/dbconfig/20260609-091147-marostegui.json
* 09:03 jiji@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=registry2005.codfw.wmnet
* 08:59 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:59 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:57 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1237.eqiad.wmnet with OS trixie
* 08:55 jiji@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=registry2005.codfw.wmnet
* 08:55 jiji@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=registry2004.codfw.wmnet
* 08:50 jiji@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=registry2004.codfw.wmnet
* 08:22 jiji@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=docker-registry,name=codfw
* 08:22 jiji@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=docker-registry,name=eqiad
* 08:08 jiji@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=docker-registry,name=eqiad
* 08:08 jiji@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=docker-registry,name=codfw
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix typoes - ayounsi@cumin1003"
* 07:59 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix typoes - ayounsi@cumin1003"
* 07:52 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:47 brouberol@dns1004: END - running authdns-update
* 07:46 brouberol@dns1004: START - running authdns-update
* 07:44 brouberol@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:43 brouberol@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:43 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:42 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:41 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:39 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:38 brouberol@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 07:37 brouberol@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 07:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
* 07:36 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.major-upgrade (exit_code=97)
* 07:36 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 07:36 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:26 fceratto@dns1004: END - running authdns-update
* 07:24 fceratto@dns1004: START - running authdns-update
* 07:22 marostegui@dns1004: END - running authdns-update
* 07:21 marostegui@dns1004: START - running authdns-update
* 07:19 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:19 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fix dse-k8s-wdqs2002 duplicate ipv6 address - elukey@cumin1003"
* 07:19 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fix dse-k8s-wdqs2002 duplicate ipv6 address - elukey@cumin1003"
* 07:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 07:12 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 07:11 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1160: Repooling
* 07:11 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
* 07:11 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1160: Repooling
* 07:11 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
* 07:00 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:00 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1237.eqiad.wmnet with OS trixie
* 06:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1160 [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93940 and previous config saved to /var/cache/conftool/dbconfig/20260609-062412-fceratto.json
* 06:17 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:16 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:16 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:16 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:15 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:15 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:15 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:14 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:12 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1244 to s4 primary and set section read-write [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93939 and previous config saved to /var/cache/conftool/dbconfig/20260609-061222-fceratto.json
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Set s4 eqiad as read-only for maintenance - [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93938 and previous config saved to /var/cache/conftool/dbconfig/20260609-061131-fceratto.json
* 06:10 federico3: Starting s4 eqiad failover from db1160 to db1244 - [[phab:T426086|T426086]]
* 06:01 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1244 with weight 0 [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93937 and previous config saved to /var/cache/conftool/dbconfig/20260609-060121-fceratto.json
* 06:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 40 hosts with reason: Primary switchover s4 [[phab:T426086|T426086]]
* 05:40 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
* 05:37 marostegui@dns1004: START - running authdns-update
* 05:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1237: Upgrading db1237.eqiad.wmnet
* 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1237: Upgrading db1237.eqiad.wmnet
* 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1237 [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93935 and previous config saved to /var/cache/conftool/dbconfig/20260609-052420-marostegui.json
* 05:23 marostegui@dns1004: START - running authdns-update
* 05:23 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1220 to x1 primary and set section read-write [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93934 and previous config saved to /var/cache/conftool/dbconfig/20260609-052311-marostegui.json
* 05:22 marostegui@cumin1003: dbctl commit (dc=all): 'Set x1 eqiad as read-only for maintenance - [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93933 and previous config saved to /var/cache/conftool/dbconfig/20260609-052253-marostegui.json
* 05:22 marostegui: Starting x1 eqiad failover from db1237 to db1220 - [[phab:T428158|T428158]]
* 05:19 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1220 with weight 0 [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93932 and previous config saved to /var/cache/conftool/dbconfig/20260609-051859-marostegui.json
* 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 16 hosts with reason: Primary switchover x1 [[phab:T428158|T428158]]
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.3 (duration: 02m 43s)
* 03:40 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.6 refs [[phab:T423915|T423915]] (duration: 37m 16s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-08 ==
* 22:00 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298915{{!}}CommonSettings: Set $wgScoreSafeMode = false (T428484)]] (duration: 07m 42s)
* 21:56 reedy@deploy1003: reedy: Continuing with deployment
* 21:54 reedy@deploy1003: reedy: Backport for [[gerrit:1298915{{!}}CommonSettings: Set $wgScoreSafeMode = false (T428484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:53 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1298915{{!}}CommonSettings: Set $wgScoreSafeMode = false (T428484)]]
* 21:12 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298891{{!}}OOUIHTMLForm: Avoid treating form header as a clickable label (T428359)]] (duration: 08m 10s)
* 21:07 mlitn@deploy1003: mlitn, neriah: Continuing with deployment
* 21:05 mlitn@deploy1003: mlitn, neriah: Backport for [[gerrit:1298891{{!}}OOUIHTMLForm: Avoid treating form header as a clickable label (T428359)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:03 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1298891{{!}}OOUIHTMLForm: Avoid treating form header as a clickable label (T428359)]]
* 20:43 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297162{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias]], [[gerrit:1298841{{!}}Squashed diff to master]] (duration: 07m 05s)
* 20:39 mlitn@deploy1003: mlitn: Continuing with deployment
* 20:38 mlitn@deploy1003: mlitn: Backport for [[gerrit:1297162{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias]], [[gerrit:1298841{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:36 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1297162{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias]], [[gerrit:1298841{{!}}Squashed diff to master]]
* 20:29 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298390{{!}}English Wikibooks: update FlaggedRevs configuration (T428329)]], [[gerrit:1298328{{!}}English Wikiversity: Add new user group "autopatrolled" (T428269)]] (duration: 08m 58s)
* 20:25 mlitn@deploy1003: mlitn, vadymts1: Continuing with deployment
* 20:22 mlitn@deploy1003: mlitn, vadymts1: Backport for [[gerrit:1298390{{!}}English Wikibooks: update FlaggedRevs configuration (T428329)]], [[gerrit:1298328{{!}}English Wikiversity: Add new user group "autopatrolled" (T428269)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:20 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1298390{{!}}English Wikibooks: update FlaggedRevs configuration (T428329)]], [[gerrit:1298328{{!}}English Wikiversity: Add new user group "autopatrolled" (T428269)]]
* 20:03 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298879{{!}}SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437)]] (duration: 37m 43s)
* 19:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:31 kharlan@deploy1003: kharlan: Continuing with deployment
* 19:30 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:29 kharlan@deploy1003: kharlan: Backport for [[gerrit:1298879{{!}}SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:28 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:27 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:25 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1298879{{!}}SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437)]]
* 19:24 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab (duration: 01m 32s)
* 19:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:22 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab
* 19:20 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab (duration: 01m 40s)
* 19:19 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab
* 19:16 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:14 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2004
* 18:52 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2004
* 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2003
* 18:52 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2003
* 18:51 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:51 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2004 to codfw - jhancock@cumin2002"
* 18:51 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2004 to codfw - jhancock@cumin2002"
* 18:44 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:42 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:42 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2030 to codfw - jhancock@cumin2002"
* 18:42 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2030 to codfw - jhancock@cumin2002"
* 18:37 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:33 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2002
* 18:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2002
* 18:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2002 to codfw - jhancock@cumin2002"
* 18:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2002 to codfw - jhancock@cumin2002"
* 18:25 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:22 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2001
* 18:22 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2001
* 18:21 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating dse-k8s-wdqs2001 to codfw - jhancock@cumin2002"
* 18:21 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating dse-k8s-wdqs2001 to codfw - jhancock@cumin2002"
* 18:17 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:02 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T427286|T427286]] (duration: 00m 12s)
* 18:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T427286|T427286]]
* 17:37 jnuche@deploy1003: Installation of scap version "4.268.0" completed for 2 hosts
* 17:35 jnuche@deploy1003: Installing scap version "4.268.0" for 2 host(s)
* 17:21 claime: restarting varnish-frontend service on cp6012
* 17:21 claime: restarting varnish-frontend service on cp6011
* 17:21 claime: restarted varnish-frontend service on cp6009
* 17:13 taavi: bounce sirenbot to get it to re-join a channel
* 17:05 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 17:05 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:58 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 16:57 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 16:55 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 16:53 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 16:53 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 16:52 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 16:30 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 16:29 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 16:29 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 16:27 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 16:27 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 16:26 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 16:26 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 16:25 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 16:18 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 16:17 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 16:17 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 16:16 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 16:16 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:16 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:16 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 16:15 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 16:14 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 16:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 16:14 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 16:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 16:13 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 16:13 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 16:13 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 16:12 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 16:12 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 16:09 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 16:08 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 16:08 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 16:07 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 16:06 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 15:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2042: repool after upgrade
* 15:45 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db[2183-2184].codfw.wmnet
* 15:45 jynus@cumin2002: START - Cookbook sre.hosts.remove-downtime for db[2183-2184].codfw.wmnet
* 15:18 jynus: dbmaint on backup1-codfw@codfw ([[phab:T428467|T428467]])
* 15:12 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2042: repool after upgrade
* 15:12 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 15:09 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:09 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2042.codfw.wmnet with OS trixie
* 15:04 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:04 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:03 jynus@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2183-2184].codfw.wmnet with reason: Switchover db
* 15:03 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:01 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 15:00 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 14:59 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:55 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:55 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:54 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:50 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:50 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:50 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:49 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2042.codfw.wmnet with reason: host reimage
* 14:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2042.codfw.wmnet with reason: host reimage
* 14:32 Lucas_WMDE: UTC afternoon backport+config window done
* 14:32 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298709{{!}}Add translatable messages for WikiProject names (T427804)]], [[gerrit:1298710{{!}}Use translatable messages for WikiProject links (T427804)]], [[gerrit:1297644{{!}}WikiProject links - remove 'text' config (T427804)]] (duration: 31m 57s)
* 14:27 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:26 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2042.codfw.wmnet with OS trixie
* 14:26 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2042: Upgrading es2042.codfw.wmnet
* 14:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2042: Upgrading es2042.codfw.wmnet
* 14:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:24 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2043 to es4 codfw primary [[phab:T428386|T428386]]', diff saved to https://phabricator.wikimedia.org/P93926 and previous config saved to /var/cache/conftool/dbconfig/20260608-142423-marostegui.json
* 14:23 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1041: repool after maintenance
* 14:19 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Continuing with deployment
* 14:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Backport for [[gerrit:1298709{{!}}Add translatable messages for WikiProject names (T427804)]], [[gerrit:1298710{{!}}Use translatable messages for WikiProject links (T427804)]], [[gerrit:1297644{{!}}WikiProject links - remove 'text' config (T427804)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:11 cgoubert@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=liftwing-openapi-server.*
* 14:10 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp6013.*
* 14:10 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:05 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 14:05 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 13:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 13:52 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:50 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296550{{!}}hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608)]] (duration: 08m 31s)
* 13:48 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:46 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 13:43 cgoubert@dns1004: END - running authdns-update
* 13:43 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296550{{!}}hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:41 cgoubert@dns1004: START - running authdns-update
* 13:41 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296550{{!}}hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608)]]
* 13:39 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 13:38 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298758{{!}}feat(V2): toggle experiment features based on custom url override (T424646)]], [[gerrit:1298762{{!}}specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646)]], [[gerrit:1298764{{!}}fix: correctly read experiments param on Special:UserLogin]], [[gerrit:1298765{{!}}signup.js: use JS var instead of TestKitchen to show exp]]
* 13:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1041: repool after maintenance
* 13:38 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 13:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:37 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 13:36 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 13:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1041.eqiad.wmnet with OS trixie
* 13:34 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 13:34 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2041: repool after upgrade
* 13:34 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde: Continuing with deployment
* 13:34 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 13:32 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 13:30 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde: Backport for [[gerrit:1298758{{!}}feat(V2): toggle experiment features based on custom url override (T424646)]], [[gerrit:1298762{{!}}specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646)]], [[gerrit:1298764{{!}}fix: correctly read experiments param on Special:UserLogin]], [[gerrit:1298765{{!}}signup.js: use JS var instead of TestKitchen to show]]
* 13:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298758{{!}}feat(V2): toggle experiment features based on custom url override (T424646)]], [[gerrit:1298762{{!}}specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646)]], [[gerrit:1298764{{!}}fix: correctly read experiments param on Special:UserLogin]], [[gerrit:1298765{{!}}signup.js: use JS var instead of TestKitchen to show expe]]
* 13:21 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298418{{!}}NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206)]], [[gerrit:1298717{{!}}Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206)]], [[gerrit:1298734{{!}}Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206)]] (duration: 11m 06s)
* 13:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1041.eqiad.wmnet with reason: host reimage
* 13:17 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
* 13:12 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 13:12 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1298418{{!}}NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206)]], [[gerrit:1298717{{!}}Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206)]], [[gerrit:1298734{{!}}Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki
* 13:12 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 13:12 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1041.eqiad.wmnet with reason: host reimage
* 13:11 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 13:11 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 13:10 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298418{{!}}NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206)]], [[gerrit:1298717{{!}}Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206)]], [[gerrit:1298734{{!}}Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206)]]
* 12:57 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298767{{!}}Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608)]] (duration: 06m 20s)
* 12:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1041.eqiad.wmnet with OS trixie
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1041: Upgrading es1041.eqiad.wmnet
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:55 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1041: Upgrading es1041.eqiad.wmnet
* 12:55 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:54 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:53 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:53 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1298767{{!}}Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:51 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1298767{{!}}Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608)]]
* 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2041: repool after upgrade
* 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:46 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2063.codfw.wmnet with OS bullseye
* 12:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2062.codfw.wmnet with OS bullseye
* 12:27 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2041.codfw.wmnet with OS trixie
* 12:21 joal@deploy1003: Finished deploy [analytics/refinery@d67c584] (thin): Regular analytics weekly train THIN [analytics/refinery@d67c584f] (duration: 02m 00s)
* 12:19 joal@deploy1003: Started deploy [analytics/refinery@d67c584] (thin): Regular analytics weekly train THIN [analytics/refinery@d67c584f]
* 12:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
* 12:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 12:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 12:16 joal@deploy1003: Finished deploy [analytics/refinery@d67c584]: Regular analytics weekly train [analytics/refinery@d67c584f] (duration: 07m 52s)
* 12:15 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
* 12:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
* 12:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2041.codfw.wmnet with reason: host reimage
* 12:08 joal@deploy1003: Started deploy [analytics/refinery@d67c584]: Regular analytics weekly train [analytics/refinery@d67c584f]
* 12:08 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
* 12:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add eqiad e8 public vlans - ayounsi@cumin1003"
* 12:06 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add eqiad e8 public vlans - ayounsi@cumin1003"
* 12:03 joal@deploy1003: Finished deploy [analytics/refinery@d67c584] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d67c584f] (duration: 02m 00s)
* 12:03 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2041.codfw.wmnet with reason: host reimage
* 12:01 joal@deploy1003: Started deploy [analytics/refinery@d67c584] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d67c584f]
* 12:01 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:00 ayounsi@cumin1003: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 12:00 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:00 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 12:00 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2063
* 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2063
* 11:57 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2063
* 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2063.codfw.wmnet 52.16.192.10.in-addr.arpa 2.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:56 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2063.codfw.wmnet 52.16.192.10.in-addr.arpa 2.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:56 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:56 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2063 - mvernon@cumin2002"
* 11:56 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2063 - mvernon@cumin2002"
* 11:51 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:51 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2063
* 11:50 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2063.codfw.wmnet with OS bullseye
* 11:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2062
* 11:50 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2062
* 11:49 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2062
* 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2062.codfw.wmnet 123.0.192.10.in-addr.arpa 3.2.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:49 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2062.codfw.wmnet 123.0.192.10.in-addr.arpa 3.2.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2062 - mvernon@cumin2002"
* 11:49 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2062 - mvernon@cumin2002"
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2041.codfw.wmnet with OS trixie
* 11:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2041: Upgrading es2041.codfw.wmnet
* 11:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2041: Upgrading es2041.codfw.wmnet
* 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:44 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.major-upgrade (exit_code=97)
* 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: repool after maintenance
* 11:43 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:43 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2062
* 11:42 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2062.codfw.wmnet with OS bullseye
* 11:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298728{{!}}SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032)]] (duration: 17m 39s)
* 11:25 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:18 Raine: progressively switching shellbox to bookworm (start)
* 11:15 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 11:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 11:14 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1298728{{!}}SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:13 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 11:12 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 11:12 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1298728{{!}}SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032)]]
* 11:02 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2062
* 11:02 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2063
* 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1042: repool after maintenance
* 10:58 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:56 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1042.eqiad.wmnet with OS trixie
* 10:47 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298721{{!}}GuessedThumbnailInfo: Also allow showing webp originals (T428202)]] (duration: 16m 41s)
* 10:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1042.eqiad.wmnet with reason: host reimage
* 10:39 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 10:39 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 10:38 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 10:36 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 10:36 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 10:35 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2043: repool after upgrade
* 10:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2160.codfw.wmnet with reason: Reboot
* 10:34 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1298721{{!}}GuessedThumbnailInfo: Also allow showing webp originals (T428202)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:34 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1042.eqiad.wmnet with reason: host reimage
* 10:30 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1298721{{!}}GuessedThumbnailInfo: Also allow showing webp originals (T428202)]]
* 10:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1042.eqiad.wmnet with OS trixie
* 10:18 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:18 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:18 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:18 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1042: Upgrading es1042.eqiad.wmnet
* 10:14 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:14 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:14 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:14 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1042: Upgrading es1042.eqiad.wmnet
* 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:12 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2063
* 10:09 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2062
* 10:07 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:07 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:07 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:06 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:52 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 09:52 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 09:50 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 09:49 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 09:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2043: repool after upgrade
* 09:49 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2043.codfw.wmnet with OS trixie
* 09:44 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 09:44 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 09:42 ozge@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: sync
* 09:42 ozge@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: sync
* 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2043.codfw.wmnet with reason: host reimage
* 09:27 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 09:23 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2043.codfw.wmnet with reason: host reimage
* 09:17 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 09:15 ozge@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: sync
* 09:15 ozge@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: sync
* 09:07 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2043.codfw.wmnet with OS trixie
* 09:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2043: Upgrading es2043.codfw.wmnet
* 09:06 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2043: Upgrading es2043.codfw.wmnet
* 09:05 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1217.eqiad.wmnet with OS trixie
* 08:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1217.eqiad.wmnet with reason: host reimage
* 08:15 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database urwikisource ([[phab:T415977|T415977]])
* 08:14 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database urwikisource ([[phab:T415977|T415977]])
* 08:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1217.eqiad.wmnet with reason: host reimage
* 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2052: repool after upgrade
* 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1051: repool after maintenance
* 08:03 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis urwikisource in section s5
* 07:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1217.eqiad.wmnet with OS trixie
* 07:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1217.eqiad.wmnet with reason: reimage
* 07:53 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
* 07:52 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis urwikisource in section s5
* 07:50 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis urwikisource in section s5
* 07:50 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.sanitize-wiki (exit_code=97) Managing sanitization for wikis urwikisource in section s5
* 07:50 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
* 07:44 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297681{{!}}Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662)]] (duration: 32m 51s)
* 07:32 wmde-fisch@deploy1003: wmde-fisch, lilients: Continuing with deployment
* 07:29 wmde-fisch@deploy1003: wmde-fisch, lilients: Backport for [[gerrit:1297681{{!}}Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:21 elukey: upgrade sudo package on an-* hosts for [[phab:T428384|T428384]]
* 07:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2052: repool after upgrade
* 07:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1051: repool after maintenance
* 07:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:12 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database urwikisource ([[phab:T415977|T415977]])
* 07:12 elukey: upgrade exim4 packages on seaborgium for security upgrades
* 07:11 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1297681{{!}}Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662)]]
* 06:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1051.eqiad.wmnet with OS trixie
* 06:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1051.eqiad.wmnet with reason: host reimage
* 06:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1051.eqiad.wmnet with reason: host reimage
* 06:15 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database urwikisource ([[phab:T415977|T415977]])
* 05:58 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1051.eqiad.wmnet with OS trixie
* 05:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2052.codfw.wmnet with OS trixie
* 05:44 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1051: Upgrading es1051.eqiad.wmnet
* 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2052.codfw.wmnet with reason: host reimage
* 05:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2052.codfw.wmnet with reason: host reimage
* 05:35 marostegui@dns1004: END - running authdns-update
* 05:34 marostegui@dns1004: START - running authdns-update
* 05:33 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1051: Upgrading es1051.eqiad.wmnet
* 05:33 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:31 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1054 to es3 eqiad primary [[phab:T428050|T428050]]', diff saved to https://phabricator.wikimedia.org/P93895 and previous config saved to /var/cache/conftool/dbconfig/20260608-053156-marostegui.json
* 05:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2052.codfw.wmnet with OS trixie
* 05:18 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2052: Upgrading es2052.codfw.wmnet
* 05:18 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2052: Upgrading es2052.codfw.wmnet
* 05:18 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
== 2026-06-07 ==
* 16:32 elukey: `elukey@cumin1003:~$ sudo cumin 'cp6* and not cp6014* and not cp6010*' "varnish-frontend-restart" -b 1`
* 16:29 elukey: restart varnish-frontend on cp6014
== 2026-06-06 ==
* 09:07 ammarpad@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=hewiki --logwiki=metawiki W.Mechelke Tungsten_Mechelke # [[phab:T428182|T428182]]
== 2026-06-05 ==
* 22:16 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 21:01 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=10 --verbose` (after stopping the other commons scan)
* 20:56 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=30 --verbose` (after stopping the other commons scan)
* 20:20 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290093{{!}}Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)]] (duration: 10m 02s)
* 20:16 krinkle@deploy1003: krinkle: Continuing with deployment
* 20:12 krinkle@deploy1003: krinkle: Backport for [[gerrit:1290093{{!}}Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1290093{{!}}Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)]]
* 16:45 jgreen@dns1004: END - running authdns-update
* 16:44 jgreen@dns1004: START - running authdns-update
* 16:17 dzahn@dns1005: END - running authdns-update
* 16:17 mutante: DNS - adding new project language "mag" - Magahi - a language spoken in India and Nepal by about 12 million native speakers ([[phab:T428266|T428266]])
* 16:16 dzahn@dns1005: START - running authdns-update
* 14:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 12:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:30 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:30 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2202.codfw.wmnet with reason: Reboot
* 12:28 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:28 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:08 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:07 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:07 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:06 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:29 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:28 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:55 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:54 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:31 ozge@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1054: repool after upgrade
* 08:08 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 08:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 08:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 08:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:39 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1054: repool after upgrade
* 07:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:17 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:17 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:16 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:07 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 06:01 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1054.eqiad.wmnet with OS trixie
* 05:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1054.eqiad.wmnet with reason: host reimage
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1054.eqiad.wmnet with reason: host reimage
* 05:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1054.eqiad.wmnet with OS trixie
* 05:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1054: Upgrading es1054.eqiad.wmnet
* 05:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1054: Upgrading es1054.eqiad.wmnet
* 05:20 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 01:55 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1010.eqiad.wmnet with OS trixie
* 01:39 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
* 01:32 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
* 01:16 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS trixie
* 00:56 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1007.eqiad.wmnet with OS trixie
* 00:40 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
* 00:33 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
* 00:17 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1007.eqiad.wmnet with OS trixie
* 00:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297268{{!}}Redirect unknown wikinews languages to portal (T427126)]] (duration: 07m 02s)
== 2026-06-04 ==
* 23:57 ladsgroup@deploy1003: ladsgroup, pppery: Continuing with deployment
* 23:57 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1006.eqiad.wmnet with OS trixie
* 23:57 ladsgroup@deploy1003: ladsgroup, pppery: Backport for [[gerrit:1297268{{!}}Redirect unknown wikinews languages to portal (T427126)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:55 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1297268{{!}}Redirect unknown wikinews languages to portal (T427126)]]
* 23:40 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
* 23:36 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
* 23:20 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS trixie
* 21:28 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host releases1003.eqiad.wmnet with OS trixie
* 21:04 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: host reimage
* 20:58 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on releases1003.eqiad.wmnet with reason: host reimage
* 20:50 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.*
* 20:42 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host releases1003.eqiad.wmnet with OS trixie
* 20:27 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1100.eqiad.wmnet,service=(cdn{{!}}ats-be)
* 20:26 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6013.drmrs.wmnet,service=(cdn{{!}}ats-be)
* 20:20 brett@dns1006: END - running authdns-update
* 20:19 brett@dns1006: START - running authdns-update
* 20:18 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5030.eqsin.wmnet with OS trixie
* 20:10 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296015{{!}}Deploy PRV to 6 wikis (T427851)]] (duration: 07m 39s)
* 20:08 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist group2.dblist extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose`
* 20:06 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:04 arlolra@deploy1003: arlolra: Backport for [[gerrit:1296015{{!}}Deploy PRV to 6 wikis (T427851)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:02 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1296015{{!}}Deploy PRV to 6 wikis (T427851)]]
* 19:49 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:43 cmooney@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:15 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5030
* 19:15 cmooney@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5030
* 19:14 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cp5030
* 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5030.eqsin.wmnet 27.0.132.10.in-addr.arpa 7.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:14 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache cp5030.eqsin.wmnet 27.0.132.10.in-addr.arpa 7.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5030 - cmooney@cumin1003"
* 19:13 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5030 - cmooney@cumin1003"
* 19:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:08 cmooney@cumin1003: START - Cookbook sre.hosts.move-vlan for host cp5030
* 19:08 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host cp5030.eqsin.wmnet with OS trixie
* 18:51 cmooney@dns2005: END - running authdns-update
* 18:50 cmooney@dns2005: START - running authdns-update
* 18:43 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:42 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for eqsin cr links - cmooney@cumin1003"
* 18:40 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for eqsin cr links - cmooney@cumin1003"
* 18:37 sukhe: sukhe@cp6013:~$ sudo traffic_server -C clear_cache
* 18:36 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:08 dancy@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.5 refs [[phab:T423914|T423914]]
* 17:17 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297751{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297752{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] (duration: 06m 40s)
* 17:13 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:13 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297751{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297752{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297751{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297752{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]]
* 16:55 topranks: shift traffic off cr1-esams et-1/0/1 link to asw1-by27-esams [[phab:T427056|T427056]]
* 16:45 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297741{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297742{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] (duration: 13m 58s)
* 16:41 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 16:33 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297741{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297742{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:31 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297741{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297742{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]]
* 16:17 ozge@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 16:03 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297740{{!}}hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)]] (duration: 10m 21s)
* 16:03 elukey: uploaded spicerack_12.7.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 15:59 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:55 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297740{{!}}hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:53 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297740{{!}}hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)]]
* 15:44 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5030.*
* 15:41 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2007.codfw.wmnet with OS trixie
* 15:39 ladsgroup@cumin1003: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
* 15:28 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 15:24 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297730{{!}}ptwiki: Disable Article Guidance experiment (T426871)]] (duration: 07m 26s)
* 15:24 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
* 15:20 sbisson@deploy1003: sbisson: Continuing with deployment
* 15:19 sbisson@deploy1003: sbisson: Backport for [[gerrit:1297730{{!}}ptwiki: Disable Article Guidance experiment (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:19 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
* 15:17 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1297730{{!}}ptwiki: Disable Article Guidance experiment (T426871)]]
* 15:13 ladsgroup@cumin1003: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
* 15:06 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297724{{!}}Revert "Start reading from new file tables on commons"]] (duration: 07m 00s)
* 15:05 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 15:02 zabe@deploy1003: zabe: Continuing with deployment
* 15:01 zabe@deploy1003: zabe: Backport for [[gerrit:1297724{{!}}Revert "Start reading from new file tables on commons"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1297724{{!}}Revert "Start reading from new file tables on commons"]]
* 14:57 zabe@deploy1003: Finished scap sync-world: [[phab:T416548|T416548]] (duration: 05m 10s)
* 14:56 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-main2007.codfw.wmnet with OS trixie
* 14:52 zabe@deploy1003: Started scap sync-world: [[phab:T416548|T416548]]
* 14:50 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:49 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:43 zabe@deploy1003: sync-world aborted: Backport for [[gerrit:1270513{{!}}Start reading from new file tables on commons (T416548)]] (duration: 03m 58s)
* 14:43 zabe@deploy1003: zabe: Continuing with deployment
* 14:41 zabe@deploy1003: zabe: Backport for [[gerrit:1270513{{!}}Start reading from new file tables on commons (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f1-codfw
* 14:40 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-f1-codfw
* 14:39 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1270513{{!}}Start reading from new file tables on commons (T416548)]]
* 14:36 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297711{{!}}hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)]] (duration: 08m 20s)
* 14:32 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:30 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297711{{!}}hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1057: repool after upgrade
* 14:28 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297711{{!}}hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)]]
* 14:20 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:16 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 14:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 14:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297704{{!}}Use the globalblock-local-status right over globalblock-whitelist (T277942)]], [[gerrit:1296620{{!}}core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)]] (duration: 06m 46s)
* 14:10 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:08 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:08 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297704{{!}}Use the globalblock-local-status right over globalblock-whitelist (T277942)]], [[gerrit:1296620{{!}}core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 14:06 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297704{{!}}Use the globalblock-local-status right over globalblock-whitelist (T277942)]], [[gerrit:1296620{{!}}core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)]]
* 14:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:06 tappof: bump space for prometheus k8s-aux in eqiad
* 14:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 14:05 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:04 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
* 13:56 _joe_: transferred requestctl api tokens for all ops to the db ([[phab:T428119|T428119]])
* 13:56 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2050 to es3 codfw primary [[phab:T428050|T428050]]', diff saved to https://phabricator.wikimedia.org/P93878 and previous config saved to /var/cache/conftool/dbconfig/20260604-135631-marostegui.json
* 13:56 Dreamy_Jazz: Afternoon UTC backport window done
* 13:54 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297700{{!}}Revert "hCaptcha: Provide always challenge sitekey for account creation"]] (duration: 13m 38s)
* 13:51 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:50 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 13:47 sukhe: sukhe@cp6011:~$ sudo -i varnish-frontend-restart
* 13:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1057: repool after upgrade
* 13:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:43 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297700{{!}}Revert "hCaptcha: Provide always challenge sitekey for account creation"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1057.eqiad.wmnet with OS trixie
* 13:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297700{{!}}Revert "hCaptcha: Provide always challenge sitekey for account creation"]]
* 13:38 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297692{{!}}hCaptcha: Provide always challenge sitekey for account creation (T421041)]] (duration: 05m 27s)
* 13:38 dreamyjazz@deploy1003: dreamyjazz: Rolling back deployment
* 13:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: down
* 13:35 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297692{{!}}hCaptcha: Provide always challenge sitekey for account creation (T421041)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297692{{!}}hCaptcha: Provide always challenge sitekey for account creation (T421041)]]
* 13:31 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295978{{!}}Update config for WikiProjects linking prototype (T427804)]] (duration: 17m 13s)
* 13:26 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Continuing with deployment
* 13:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1057.eqiad.wmnet with reason: host reimage
* 13:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1057.eqiad.wmnet with reason: host reimage
* 13:16 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Backport for [[gerrit:1295978{{!}}Update config for WikiProjects linking prototype (T427804)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1295978{{!}}Update config for WikiProjects linking prototype (T427804)]]
* 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1220: Migration of db1220.eqiad.wmnet completed
* 13:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: down
* 13:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1224', diff saved to https://phabricator.wikimedia.org/P93875 and previous config saved to /var/cache/conftool/dbconfig/20260604-131219-marostegui.json
* 13:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1057.eqiad.wmnet with OS trixie
* 13:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1057: Upgrading es1057.eqiad.wmnet
* 12:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1057: Upgrading es1057.eqiad.wmnet
* 12:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:56 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296557{{!}}wmf-config: Skip CAPTCHA for action=mcrundo (T427612)]] (duration: 08m 30s)
* 12:52 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Continuing with deployment
* 12:50 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Backport for [[gerrit:1296557{{!}}wmf-config: Skip CAPTCHA for action=mcrundo (T427612)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2050: repool after upgrade
* 12:48 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296557{{!}}wmf-config: Skip CAPTCHA for action=mcrundo (T427612)]]
* 12:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 12:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 12:28 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1220: Migration of db1220.eqiad.wmnet completed
* 12:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1220.eqiad.wmnet with OS trixie
* 12:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2050: repool after upgrade
* 12:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1220.eqiad.wmnet with reason: host reimage
* 11:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1220.eqiad.wmnet with reason: host reimage
* 11:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1220.eqiad.wmnet with OS trixie
* 11:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2050.codfw.wmnet with OS trixie
* 11:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1220: Upgrading db1220.eqiad.wmnet
* 11:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1220: Upgrading db1220.eqiad.wmnet
* 11:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1179: Migration of db1179.eqiad.wmnet completed
* 11:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2050.codfw.wmnet with reason: host reimage
* 11:16 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2050.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2050.codfw.wmnet with OS trixie
* 11:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2050: Upgrading es2050.codfw.wmnet
* 10:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2050: Upgrading es2050.codfw.wmnet
* 10:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2057: repool after upgrade
* 10:58 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:55 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 10:46 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1179: Migration of db1179.eqiad.wmnet completed
* 10:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1179.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1179.eqiad.wmnet with reason: host reimage
* 10:16 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
* 10:15 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
* 10:15 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
* 10:15 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/kartotherian: apply
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1179.eqiad.wmnet with reason: host reimage
* 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2057: repool after upgrade
* 10:13 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2057.codfw.wmnet with OS trixie
* 09:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1179.eqiad.wmnet with OS trixie
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1179: Upgrading db1179.eqiad.wmnet
* 09:58 jynus: redoing m2 backups after grant change [[phab:T411111|T411111]]
* 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1179: Upgrading db1179.eqiad.wmnet
* 09:56 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2057.codfw.wmnet with reason: host reimage
* 09:53 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2057.codfw.wmnet with reason: host reimage
* 09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Migration of db1224.eqiad.wmnet completed
* 09:38 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:36 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:35 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:33 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2057.codfw.wmnet with OS trixie
* 09:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2057: Upgrading es2057.codfw.wmnet
* 09:32 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2057: Upgrading es2057.codfw.wmnet
* 09:31 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:26 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=30 --sleep=60 --verbose`
* 09:25 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "group0.dblist + group1.dblist - mediamoderation-continuous-scan.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose`
* 08:54 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Introduce pluggable authentication - oblivian@cumin1003"
* 08:54 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Introduce pluggable authentication - oblivian@cumin1003
* 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Migration of db1224.eqiad.wmnet completed
* 08:53 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Introduce pluggable authentication - oblivian@cumin1003
* 08:53 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Introduce pluggable authentication - oblivian@cumin1003"
* 08:29 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:29 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 08:24 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:24 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 08:21 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1224.eqiad.wmnet with OS trixie
* 08:21 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 08:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
* 08:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2249.codfw.wmnet with reason: upgrade
* 08:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
* 07:53 marostegui: Install mariadb 10.11.17 on db2249 [[phab:T427345|T427345]]
* 07:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1224.eqiad.wmnet with OS trixie
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1224: Upgrading db1224.eqiad.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1224: Upgrading db1224.eqiad.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1255: Migration of db1255.eqiad.wmnet completed
* 07:34 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297536{{!}}hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200{{!}}hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173{{!}}hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]] (duration: 08m 56s)
* 07:29 kharlan@deploy1003: kharlan, harroyo-wmf: Continuing with deployment
* 07:27 kharlan@deploy1003: kharlan, harroyo-wmf: Backport for [[gerrit:1297536{{!}}hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200{{!}}hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173{{!}}hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwd
* 07:25 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297536{{!}}hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200{{!}}hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173{{!}}hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]]
* 07:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2191: Migration of db2191.codfw.wmnet completed
* 07:12 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297550{{!}}Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]] (duration: 06m 45s)
* 07:08 kharlan@deploy1003: kharlan: Continuing with deployment
* 07:08 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297550{{!}}Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:06 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297550{{!}}Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]]
* 07:04 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297260{{!}}EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)]] (duration: 399m 30s)
* 07:03 otto@deploy1003: otto: Rolling back deployment
* 06:53 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1255: Migration of db1255.eqiad.wmnet completed
* 06:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1255.eqiad.wmnet with OS trixie
* 06:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2191: Migration of db2191.codfw.wmnet completed
* 06:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1255.eqiad.wmnet with reason: host reimage
* 06:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2191.codfw.wmnet with OS trixie
* 06:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1255.eqiad.wmnet with reason: host reimage
* 06:16 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1255.eqiad.wmnet with OS trixie
* 06:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2191.codfw.wmnet with reason: host reimage
* 06:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1255: Upgrading db1255.eqiad.wmnet
* 06:12 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1255: Upgrading db1255.eqiad.wmnet
* 06:12 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2191.codfw.wmnet with reason: host reimage
* 06:04 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db1255 [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93836 and previous config saved to /var/cache/conftool/dbconfig/20260604-060428-cwilliams.json
* 06:03 cwilliams@dns1004: END - running authdns-update
* 06:02 cwilliams@dns1004: START - running authdns-update
* 05:54 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db1258 to x3 primary and set section read-write [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93835 and previous config saved to /var/cache/conftool/dbconfig/20260604-055429-cwilliams.json
* 05:53 cwilliams@cumin1003: dbctl commit (dc=all): 'Set x3 eqiad as read-only for maintenance - [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93834 and previous config saved to /var/cache/conftool/dbconfig/20260604-055346-cwilliams.json
* 05:53 cezmunsta: Starting x3 eqiad failover from db1255 to db1258 - [[phab:T427895|T427895]]
* 05:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2191.codfw.wmnet with OS trixie
* 05:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2191: Upgrading db2191.codfw.wmnet
* 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2191: Upgrading db2191.codfw.wmnet
* 05:50 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db1258 with weight 0 [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93833 and previous config saved to /var/cache/conftool/dbconfig/20260604-055021-cwilliams.json
* 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:50 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T427895|T427895]]
* 05:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 05:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2191 [[phab:T428120|T428120]]', diff saved to https://phabricator.wikimedia.org/P93832 and previous config saved to /var/cache/conftool/dbconfig/20260604-054614-marostegui.json
* 05:45 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2215 to x1 primary [[phab:T428120|T428120]]', diff saved to https://phabricator.wikimedia.org/P93831 and previous config saved to /var/cache/conftool/dbconfig/20260604-054528-marostegui.json
* 05:44 marostegui: Starting x1 codfw failover from db2191 to db2215 - [[phab:T428120|T428120]]
* 05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 16 hosts with reason: Primary switchover x1 [[phab:T428120|T428120]]
* 05:27 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2215 with weight 0 [[phab:T428120|T428120]]', diff saved to https://phabricator.wikimedia.org/P93830 and previous config saved to /var/cache/conftool/dbconfig/20260604-052722-marostegui.json
* 05:19 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 03:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93829 and previous config saved to /var/cache/conftool/dbconfig/20260604-034546-fceratto.json
* 03:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P93828 and previous config saved to /var/cache/conftool/dbconfig/20260604-033538-fceratto.json
* 03:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P93827 and previous config saved to /var/cache/conftool/dbconfig/20260604-032531-fceratto.json
* 03:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93826 and previous config saved to /var/cache/conftool/dbconfig/20260604-031523-fceratto.json
* 03:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1263 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93825 and previous config saved to /var/cache/conftool/dbconfig/20260604-030710-fceratto.json
* 03:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1263.eqiad.wmnet with reason: Maintenance
* 03:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93824 and previous config saved to /var/cache/conftool/dbconfig/20260604-030642-fceratto.json
* 02:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P93823 and previous config saved to /var/cache/conftool/dbconfig/20260604-025634-fceratto.json
* 02:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P93822 and previous config saved to /var/cache/conftool/dbconfig/20260604-024627-fceratto.json
* 02:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93821 and previous config saved to /var/cache/conftool/dbconfig/20260604-023619-fceratto.json
* 02:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1262 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93820 and previous config saved to /var/cache/conftool/dbconfig/20260604-022809-fceratto.json
* 02:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1262.eqiad.wmnet with reason: Maintenance
* 02:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93819 and previous config saved to /var/cache/conftool/dbconfig/20260604-022742-fceratto.json
* 02:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P93818 and previous config saved to /var/cache/conftool/dbconfig/20260604-021734-fceratto.json
* 02:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P93817 and previous config saved to /var/cache/conftool/dbconfig/20260604-020726-fceratto.json
* 01:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93816 and previous config saved to /var/cache/conftool/dbconfig/20260604-015718-fceratto.json
* 01:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1261 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93815 and previous config saved to /var/cache/conftool/dbconfig/20260604-014909-fceratto.json
* 01:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1261.eqiad.wmnet with reason: Maintenance
* 01:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93814 and previous config saved to /var/cache/conftool/dbconfig/20260604-014841-fceratto.json
* 01:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P93813 and previous config saved to /var/cache/conftool/dbconfig/20260604-013833-fceratto.json
* 01:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P93812 and previous config saved to /var/cache/conftool/dbconfig/20260604-012826-fceratto.json
* 01:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93811 and previous config saved to /var/cache/conftool/dbconfig/20260604-011818-fceratto.json
* 01:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1260 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93810 and previous config saved to /var/cache/conftool/dbconfig/20260604-011005-fceratto.json
* 01:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1260.eqiad.wmnet with reason: Maintenance
* 01:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93809 and previous config saved to /var/cache/conftool/dbconfig/20260604-010937-fceratto.json
* 00:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P93808 and previous config saved to /var/cache/conftool/dbconfig/20260604-005929-fceratto.json
* 00:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P93807 and previous config saved to /var/cache/conftool/dbconfig/20260604-004922-fceratto.json
* 00:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93806 and previous config saved to /var/cache/conftool/dbconfig/20260604-003914-fceratto.json
* 00:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1252 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93805 and previous config saved to /var/cache/conftool/dbconfig/20260604-002851-fceratto.json
* 00:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1252.eqiad.wmnet with reason: Maintenance
* 00:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93804 and previous config saved to /var/cache/conftool/dbconfig/20260604-002821-fceratto.json
* 00:26 otto@deploy1003: otto: Backport for [[gerrit:1297260{{!}}EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:24 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1297260{{!}}EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)]]
* 00:18 Amir1: mwscript-k8s --follow --dblist=all -- extensions/timeline/maintenance/DeleteOldTimelineFiles.php --date 20210101000000
* 00:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P93803 and previous config saved to /var/cache/conftool/dbconfig/20260604-001813-fceratto.json
* 00:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P93802 and previous config saved to /var/cache/conftool/dbconfig/20260604-000805-fceratto.json
== 2026-06-03 ==
* 23:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93801 and previous config saved to /var/cache/conftool/dbconfig/20260603-235758-fceratto.json
* 23:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93800 and previous config saved to /var/cache/conftool/dbconfig/20260603-234935-fceratto.json
* 23:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
* 23:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93799 and previous config saved to /var/cache/conftool/dbconfig/20260603-234907-fceratto.json
* 23:42 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296561{{!}}Add a maintenance script to delete old files]], [[gerrit:1296560{{!}}Add a maintenance script to delete old files]] (duration: 07m 09s)
* 23:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P93798 and previous config saved to /var/cache/conftool/dbconfig/20260603-233859-fceratto.json
* 23:37 ladsgroup@deploy1003: ladsgroup, reedy: Continuing with deployment
* 23:36 ladsgroup@deploy1003: ladsgroup, reedy: Backport for [[gerrit:1296561{{!}}Add a maintenance script to delete old files]], [[gerrit:1296560{{!}}Add a maintenance script to delete old files]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:34 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1296561{{!}}Add a maintenance script to delete old files]], [[gerrit:1296560{{!}}Add a maintenance script to delete old files]]
* 23:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P93797 and previous config saved to /var/cache/conftool/dbconfig/20260603-232852-fceratto.json
* 23:22 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 23:22 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 23:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93796 and previous config saved to /var/cache/conftool/dbconfig/20260603-231844-fceratto.json
* 23:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93795 and previous config saved to /var/cache/conftool/dbconfig/20260603-231031-fceratto.json
* 23:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
* 23:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93794 and previous config saved to /var/cache/conftool/dbconfig/20260603-231001-fceratto.json
* 22:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P93793 and previous config saved to /var/cache/conftool/dbconfig/20260603-225953-fceratto.json
* 22:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P93792 and previous config saved to /var/cache/conftool/dbconfig/20260603-224945-fceratto.json
* 22:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93791 and previous config saved to /var/cache/conftool/dbconfig/20260603-223937-fceratto.json
* 22:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1244 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93790 and previous config saved to /var/cache/conftool/dbconfig/20260603-223116-fceratto.json
* 22:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1244.eqiad.wmnet with reason: Maintenance
* 22:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93789 and previous config saved to /var/cache/conftool/dbconfig/20260603-223048-fceratto.json
* 22:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P93788 and previous config saved to /var/cache/conftool/dbconfig/20260603-222041-fceratto.json
* 22:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P93787 and previous config saved to /var/cache/conftool/dbconfig/20260603-221034-fceratto.json
* 22:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93786 and previous config saved to /var/cache/conftool/dbconfig/20260603-220026-fceratto.json
* 21:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1243 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93785 and previous config saved to /var/cache/conftool/dbconfig/20260603-215110-fceratto.json
* 21:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93784 and previous config saved to /var/cache/conftool/dbconfig/20260603-215053-fceratto.json
* 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P93783 and previous config saved to /var/cache/conftool/dbconfig/20260603-214046-fceratto.json
* 21:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P93782 and previous config saved to /var/cache/conftool/dbconfig/20260603-213038-fceratto.json
* 21:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93781 and previous config saved to /var/cache/conftool/dbconfig/20260603-212030-fceratto.json
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1242 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93779 and previous config saved to /var/cache/conftool/dbconfig/20260603-211206-fceratto.json
* 21:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
* 21:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93778 and previous config saved to /var/cache/conftool/dbconfig/20260603-211138-fceratto.json
* 21:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P93774 and previous config saved to /var/cache/conftool/dbconfig/20260603-210130-fceratto.json
* 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P93773 and previous config saved to /var/cache/conftool/dbconfig/20260603-205122-fceratto.json
* 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93772 and previous config saved to /var/cache/conftool/dbconfig/20260603-204115-fceratto.json
* 20:33 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297228{{!}}Attribution research don't use testKitchen compatibility layer (T417050)]] (duration: 06m 41s)
* 20:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1241 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93771 and previous config saved to /var/cache/conftool/dbconfig/20260603-203254-fceratto.json
* 20:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
* 20:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93770 and previous config saved to /var/cache/conftool/dbconfig/20260603-203227-fceratto.json
* 20:29 cjming@deploy1003: cjming: Continuing with deployment
* 20:29 cjming@deploy1003: cjming: Backport for [[gerrit:1297228{{!}}Attribution research don't use testKitchen compatibility layer (T417050)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:26 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1297228{{!}}Attribution research don't use testKitchen compatibility layer (T417050)]]
* 20:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P93769 and previous config saved to /var/cache/conftool/dbconfig/20260603-202219-fceratto.json
* 20:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P93766 and previous config saved to /var/cache/conftool/dbconfig/20260603-201211-fceratto.json
* 20:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93765 and previous config saved to /var/cache/conftool/dbconfig/20260603-200203-fceratto.json
* 19:59 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
* 19:59 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
* 19:59 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 19:59 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 19:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93764 and previous config saved to /var/cache/conftool/dbconfig/20260603-195341-fceratto.json
* 19:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
* 19:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93763 and previous config saved to /var/cache/conftool/dbconfig/20260603-195313-fceratto.json
* 19:47 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 19:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P93762 and previous config saved to /var/cache/conftool/dbconfig/20260603-194306-fceratto.json
* 19:39 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 19:37 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 19:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P93761 and previous config saved to /var/cache/conftool/dbconfig/20260603-193258-fceratto.json
* 19:26 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
* 19:25 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
* 19:25 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 19:25 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 19:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93760 and previous config saved to /var/cache/conftool/dbconfig/20260603-192250-fceratto.json
* 19:22 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 19:22 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 19:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93759 and previous config saved to /var/cache/conftool/dbconfig/20260603-191437-fceratto.json
* 19:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1024-1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 19:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
* 19:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93758 and previous config saved to /var/cache/conftool/dbconfig/20260603-191348-fceratto.json
* 19:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P93757 and previous config saved to /var/cache/conftool/dbconfig/20260603-190340-fceratto.json
* 18:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P93756 and previous config saved to /var/cache/conftool/dbconfig/20260603-185331-fceratto.json
* 18:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93755 and previous config saved to /var/cache/conftool/dbconfig/20260603-184324-fceratto.json
* 18:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1199 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93754 and previous config saved to /var/cache/conftool/dbconfig/20260603-183455-fceratto.json
* 18:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
* 18:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93753 and previous config saved to /var/cache/conftool/dbconfig/20260603-183427-fceratto.json
* 18:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P93752 and previous config saved to /var/cache/conftool/dbconfig/20260603-182420-fceratto.json
* 18:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P93751 and previous config saved to /var/cache/conftool/dbconfig/20260603-181412-fceratto.json
* 18:10 dancy@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.5 refs [[phab:T423914|T423914]]
* 18:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93750 and previous config saved to /var/cache/conftool/dbconfig/20260603-180404-fceratto.json
* 17:57 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 17:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93749 and previous config saved to /var/cache/conftool/dbconfig/20260603-175544-fceratto.json
* 17:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
* 17:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93748 and previous config saved to /var/cache/conftool/dbconfig/20260603-175342-fceratto.json
* 17:52 hashar: contint1003: sudo puppet agent --disable "Prevent Jenkins from coming back"
* 17:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P93747 and previous config saved to /var/cache/conftool/dbconfig/20260603-174334-fceratto.json
* 17:38 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:37 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:37 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:33 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P93746 and previous config saved to /var/cache/conftool/dbconfig/20260603-173327-fceratto.json
* 17:33 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:32 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 17:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93745 and previous config saved to /var/cache/conftool/dbconfig/20260603-172319-fceratto.json
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: Stopping before sync operations
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: Started scap sync-world: No-deploy scap run to verify scap config change
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1253 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93744 and previous config saved to /var/cache/conftool/dbconfig/20260603-171521-fceratto.json
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1253.eqiad.wmnet with reason: Maintenance
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93743 and previous config saved to /var/cache/conftool/dbconfig/20260603-171452-fceratto.json
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:12 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:10 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:10 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:10 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:09 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2012.wikimedia.org with OS trixie
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P93742 and previous config saved to /var/cache/conftool/dbconfig/20260603-170444-fceratto.json
* 17:04 swfrench@deploy1003: Stopping before sync operations
* 17:03 swfrench@deploy1003: Started scap sync-world: No-deploy scap run to verify clean state before config change
* 16:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P93741 and previous config saved to /var/cache/conftool/dbconfig/20260603-165436-fceratto.json
* 16:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:53 hashar: Restarting CI Jenkins one last time # [[phab:T418521|T418521]]
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:44 btullis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295922{{!}}Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)]] (duration: 07m 16s)
* 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93740 and previous config saved to /var/cache/conftool/dbconfig/20260603-164428-fceratto.json
* 16:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:42 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:41 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:40 btullis@deploy1003: btullis: Continuing with deployment
* 16:39 btullis@deploy1003: btullis: Backport for [[gerrit:1295922{{!}}Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93739 and previous config saved to /var/cache/conftool/dbconfig/20260603-163726-fceratto.json
* 16:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance
* 16:37 btullis@deploy1003: Started scap sync-world: Backport for [[gerrit:1295922{{!}}Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)]]
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93738 and previous config saved to /var/cache/conftool/dbconfig/20260603-163658-fceratto.json
* 16:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P93737 and previous config saved to /var/cache/conftool/dbconfig/20260603-162650-fceratto.json
* 16:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P93736 and previous config saved to /var/cache/conftool/dbconfig/20260603-161643-fceratto.json
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93735 and previous config saved to /var/cache/conftool/dbconfig/20260603-160635-fceratto.json
* 16:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93734 and previous config saved to /var/cache/conftool/dbconfig/20260603-155928-fceratto.json
* 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93733 and previous config saved to /var/cache/conftool/dbconfig/20260603-155859-fceratto.json
* 15:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P93732 and previous config saved to /var/cache/conftool/dbconfig/20260603-154852-fceratto.json
* 15:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:46 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2012.wikimedia.org with OS trixie
* 15:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:40 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
* 15:40 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
* 15:40 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 15:39 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 15:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P93731 and previous config saved to /var/cache/conftool/dbconfig/20260603-153844-fceratto.json
* 15:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93729 and previous config saved to /var/cache/conftool/dbconfig/20260603-152836-fceratto.json
* 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
* 15:25 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
* 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
* 15:25 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
* 15:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:23 mutante: disabling jenkins on CI servers for maintenance
* 15:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
* 15:23 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1202 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93728 and previous config saved to /var/cache/conftool/dbconfig/20260603-152129-fceratto.json
* 15:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 15:21 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2012 to codfw - jhancock@cumin2002"
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93727 and previous config saved to /var/cache/conftool/dbconfig/20260603-152102-fceratto.json
* 15:20 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2012 to codfw - jhancock@cumin2002"
* 15:18 brouberol@dns1004: END - running authdns-update
* 15:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:16 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:16 brouberol@dns1004: START - running authdns-update
* 15:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P93726 and previous config saved to /var/cache/conftool/dbconfig/20260603-151055-fceratto.json
* 15:01 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P93725 and previous config saved to /var/cache/conftool/dbconfig/20260603-150047-fceratto.json
* 14:57 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 cmooney@cumin1003: END (FAIL) - Cookbook sre.netbox.update-extras (exit_code=1) rolling restart_daemons on A:netbox
* 14:51 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93723 and previous config saved to /var/cache/conftool/dbconfig/20260603-145039-fceratto.json
* 14:48 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297137{{!}}Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"]] (duration: 06m 46s)
* 14:47 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 14:46 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:46 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:43 mlitn@deploy1003: mlitn: Continuing with deployment
* 14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93722 and previous config saved to /var/cache/conftool/dbconfig/20260603-144334-fceratto.json
* 14:43 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 14:43 mlitn@deploy1003: mlitn: Backport for [[gerrit:1297137{{!}}Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93721 and previous config saved to /var/cache/conftool/dbconfig/20260603-144306-fceratto.json
* 14:41 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:41 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:41 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1297137{{!}}Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"]]
* 14:39 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:39 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:39 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:39 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:38 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 14:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 14:34 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297130{{!}}editor: make redesigned anon warning the default experience (T424595)]] (duration: 10m 45s)
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P93719 and previous config saved to /var/cache/conftool/dbconfig/20260603-143259-fceratto.json
* 14:30 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:28 sgimeno@deploy1003: sgimeno: Continuing with deployment
* 14:25 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1297130{{!}}editor: make redesigned anon warning the default experience (T424595)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:24 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:24 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:23 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1297130{{!}}editor: make redesigned anon warning the default experience (T424595)]]
* 14:23 gengh@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P93717 and previous config saved to /var/cache/conftool/dbconfig/20260603-142251-fceratto.json
* 14:22 gengh@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:22 gengh@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:21 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:21 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:21 gengh@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:20 gengh@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:20 gengh@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:20 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:20 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:19 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:19 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:16 vriley@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:16 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:16 gengh@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:13 gengh@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:12 gengh@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93716 and previous config saved to /var/cache/conftool/dbconfig/20260603-141242-fceratto.json
* 14:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:11 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:11 gengh@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc2055.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:10 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mc2055.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:10 gengh@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:09 gengh@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:08 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:07 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:05 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296631{{!}}translate: adding separate read/write endpoints (T425377)]] (duration: 13m 06s)
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1191 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93715 and previous config saved to /var/cache/conftool/dbconfig/20260603-140537-fceratto.json
* 14:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93714 and previous config saved to /var/cache/conftool/dbconfig/20260603-140507-fceratto.json
* 14:01 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:58 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 13:56 dcausse@deploy1003: atsuko, dcausse: Rolling back deployment
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T426633|T426633]])', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-133440-fceratto.json
* 13:29 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2186: Migration of db2186.codfw.wmnet completed
* 13:28 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295910{{!}}hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)]] (duration: 07m 36s)
* 13:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T426633|T426633]])', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-132638-fceratto.json
* 13:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 13:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93710 and previous config saved to /var/cache/conftool/dbconfig/20260603-132605-fceratto.json
* 13:25 sukhe: sudo cumin 'A:lvs or A:liberica' 'disable-puppet "merging CR 1282764"'
* 13:23 kharlan@deploy1003: kharlan: Continuing with deployment
* 13:22 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295910{{!}}hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295910{{!}}hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)]]
* 13:18 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296649{{!}}hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)]] (duration: 07m 46s)
* 13:16 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-131556-fceratto.json
* 13:15 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:13 kharlan@deploy1003: dbrant, kharlan: Continuing with deployment
* 13:12 kharlan@deploy1003: dbrant, kharlan: Backport for [[gerrit:1296649{{!}}hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296649{{!}}hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)]]
* 13:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add codfw d3 and e5 public vlans - ayounsi@cumin1003"
* 13:09 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add codfw d3 and e5 public vlans - ayounsi@cumin1003"
* 13:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P93708 and previous config saved to /var/cache/conftool/dbconfig/20260603-130548-fceratto.json
* 13:05 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93706 and previous config saved to /var/cache/conftool/dbconfig/20260603-125540-fceratto.json
* 12:51 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297110{{!}}ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)]] (duration: 07m 44s)
* 12:49 jgreen@dns1004: END - running authdns-update
* 12:47 jgreen@dns1004: START - running authdns-update
* 12:46 jiji@deploy1003: jiji: Continuing with deployment
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93705 and previous config saved to /var/cache/conftool/dbconfig/20260603-124624-fceratto.json
* 12:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93704 and previous config saved to /var/cache/conftool/dbconfig/20260603-124556-fceratto.json
* 12:45 jiji@deploy1003: jiji: Backport for [[gerrit:1297110{{!}}ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:43 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2186: Migration of db2186.codfw.wmnet completed
* 12:43 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1297110{{!}}ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)]]
* 12:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1067.eqiad.wmnet with OS bullseye
* 12:38 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1292364{{!}}Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)]] (duration: 11m 15s)
* 12:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2186.codfw.wmnet with OS trixie
* 12:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P93702 and previous config saved to /var/cache/conftool/dbconfig/20260603-123548-fceratto.json
* 12:34 dreamyjazz@deploy1003: somerandomdeveloper, dreamyjazz: Continuing with deployment
* 12:31 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1066.eqiad.wmnet with OS bullseye
* 12:29 dreamyjazz@deploy1003: somerandomdeveloper, dreamyjazz: Backport for [[gerrit:1292364{{!}}Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:27 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1292364{{!}}Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)]]
* 12:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P93701 and previous config saved to /var/cache/conftool/dbconfig/20260603-122541-fceratto.json
* 12:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1067.eqiad.wmnet with reason: host reimage
* 12:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2186.codfw.wmnet with reason: host reimage
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93700 and previous config saved to /var/cache/conftool/dbconfig/20260603-121533-fceratto.json
* 12:13 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ms-be1066.eqiad.wmnet with reason: host reimage
* 12:13 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2186.codfw.wmnet with reason: host reimage
* 12:11 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1067.eqiad.wmnet with reason: host reimage
* 12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93699 and previous config saved to /var/cache/conftool/dbconfig/20260603-120732-fceratto.json
* 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 12:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93698 and previous config saved to /var/cache/conftool/dbconfig/20260603-120634-fceratto.json
* 12:03 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1066.eqiad.wmnet with reason: host reimage
* 11:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P93697 and previous config saved to /var/cache/conftool/dbconfig/20260603-115626-fceratto.json
* 11:54 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2186.codfw.wmnet with OS trixie
* 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1067
* 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1067
* 11:52 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1067
* 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1067.eqiad.wmnet 96.48.64.10.in-addr.arpa 6.9.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:52 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1067.eqiad.wmnet 96.48.64.10.in-addr.arpa 6.9.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1067 - mvernon@cumin2002"
* 11:52 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1067 - mvernon@cumin2002"
* 11:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2186: Upgrading db2186.codfw.wmnet
* 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2186: Upgrading db2186.codfw.wmnet
* 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:47 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:46 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1067
* 11:46 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1067.eqiad.wmnet with OS bullseye
* 11:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P93695 and previous config saved to /var/cache/conftool/dbconfig/20260603-114618-fceratto.json
* 11:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1066
* 11:46 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1066
* 11:45 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1066
* 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1066.eqiad.wmnet 117.32.64.10.in-addr.arpa 7.1.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:45 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1066.eqiad.wmnet 117.32.64.10.in-addr.arpa 7.1.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1066 - mvernon@cumin2002"
* 11:45 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1066 - mvernon@cumin2002"
* 11:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 11:41 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:40 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1066
* 11:40 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1066.eqiad.wmnet with OS bullseye
* 11:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1067
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93693 and previous config saved to /var/cache/conftool/dbconfig/20260603-113611-fceratto.json
* 11:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Migration of db2196.codfw.wmnet completed
* 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93691 and previous config saved to /var/cache/conftool/dbconfig/20260603-112909-fceratto.json
* 11:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 6 hosts with reason: Maintenance
* 11:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
* 11:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93690 and previous config saved to /var/cache/conftool/dbconfig/20260603-112838-fceratto.json
* 11:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P93689 and previous config saved to /var/cache/conftool/dbconfig/20260603-111831-fceratto.json
* 11:14 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:09 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 11:09 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 11:08 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P93687 and previous config saved to /var/cache/conftool/dbconfig/20260603-110823-fceratto.json
* 11:07 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1066
* 11:07 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 11:06 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
* 11:05 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
* 11:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289895{{!}}Update UserInfoCard to be enabled by default for certain user groups (T426021)]] (duration: 07m 37s)
* 11:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:59 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 10:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:59 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 10:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 10:58 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93685 and previous config saved to /var/cache/conftool/dbconfig/20260603-105815-fceratto.json
* 10:58 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:56 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 10:55 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1289895{{!}}Update UserInfoCard to be enabled by default for certain user groups (T426021)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:54 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 10:54 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
* 10:53 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
* 10:53 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1289895{{!}}Update UserInfoCard to be enabled by default for certain user groups (T426021)]]
* 10:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1198 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93684 and previous config saved to /var/cache/conftool/dbconfig/20260603-105006-fceratto.json
* 10:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 10:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93683 and previous config saved to /var/cache/conftool/dbconfig/20260603-104939-fceratto.json
* 10:45 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:45 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Migration of db2196.codfw.wmnet completed
* 10:44 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:41 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:40 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P93681 and previous config saved to /var/cache/conftool/dbconfig/20260603-103931-fceratto.json
* 10:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1053: repool after upgrade
* 10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2196.codfw.wmnet with OS trixie
* 10:36 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297090{{!}}hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)]] (duration: 12m 03s)
* 10:32 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 10:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P93679 and previous config saved to /var/cache/conftool/dbconfig/20260603-102924-fceratto.json
* 10:26 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297090{{!}}hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:24 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297090{{!}}hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)]]
* 10:22 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1067
* 10:21 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1066
* 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93677 and previous config saved to /var/cache/conftool/dbconfig/20260603-101916-fceratto.json
* 10:15 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2013.codfw.wmnet
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
* 10:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93676 and previous config saved to /var/cache/conftool/dbconfig/20260603-101105-fceratto.json
* 10:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 10:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93675 and previous config saved to /var/cache/conftool/dbconfig/20260603-101037-fceratto.json
* 10:10 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2013.codfw.wmnet
* 10:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P93673 and previous config saved to /var/cache/conftool/dbconfig/20260603-100029-fceratto.json
* 09:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2196.codfw.wmnet with OS trixie
* 09:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: Upgrading db2196.codfw.wmnet
* 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2196: Upgrading db2196.codfw.wmnet
* 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1053: repool after upgrade
* 09:52 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:52 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:51 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:51 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:51 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P93670 and previous config saved to /var/cache/conftool/dbconfig/20260603-095022-fceratto.json
* 09:49 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:49 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:48 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1053.eqiad.wmnet with OS trixie
* 09:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2013.codfw.wmnet
* 09:41 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on es1053.eqiad.wmnet with reason: host reimage
* 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1053.eqiad.wmnet with reason: host reimage
* 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93669 and previous config saved to /var/cache/conftool/dbconfig/20260603-094014-fceratto.json
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2215: Migration of db2215.codfw.wmnet completed
* 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2013.codfw.wmnet
* 09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93667 and previous config saved to /var/cache/conftool/dbconfig/20260603-093146-fceratto.json
* 09:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93666 and previous config saved to /var/cache/conftool/dbconfig/20260603-093119-fceratto.json
* 09:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1211: Migration of db1211.eqiad.wmnet completed
* 09:27 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297069{{!}}hCaptcha: Collect risk score for blocked account creations (T427784)]] (duration: 07m 26s)
* 09:25 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1053.eqiad.wmnet with OS trixie
* 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add public1-b3-codfw gateway IPs - ayounsi@cumin1003"
* 09:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add public1-b3-codfw gateway IPs - ayounsi@cumin1003"
* 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1053: Upgrading es1053.eqiad.wmnet
* 09:23 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1053: Upgrading es1053.eqiad.wmnet
* 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:21 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297069{{!}}hCaptcha: Collect risk score for blocked account creations (T427784)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:21 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 09:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2054: repool after upgrade
* 09:21 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
* 09:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P93661 and previous config saved to /var/cache/conftool/dbconfig/20260603-092111-fceratto.json
* 09:20 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:20 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297069{{!}}hCaptcha: Collect risk score for blocked account creations (T427784)]]
* 09:14 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297065{{!}}Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"]] (duration: 07m 06s)
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P93659 and previous config saved to /var/cache/conftool/dbconfig/20260603-091104-fceratto.json
* 09:10 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:09 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297065{{!}}Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:07 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297065{{!}}Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"]]
* 09:06 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 09:06 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297064{{!}}Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] (duration: 10m 54s)
* 09:05 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 09:04 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main'.
* 09:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003 - [[phab:T422043|T422043]]"
* 09:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93656 and previous config saved to /var/cache/conftool/dbconfig/20260603-090056-fceratto.json
* 09:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003 - [[phab:T422043|T422043]]"
* 09:00 ayounsi@cumin1003: END (ERROR) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=97) generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003"
* 09:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003"
* 08:59 kharlan@deploy1003: kharlan: Continuing with deployment
* 08:59 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297064{{!}}Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:55 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297064{{!}}Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]]
* 08:53 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296635{{!}}Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] (duration: 11m 43s)
* 08:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2215: Migration of db2215.codfw.wmnet completed
* 08:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet
* 08:52 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet
* 08:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb[1022-1023].eqiad.wmnet
* 08:51 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb[1022-1023].eqiad.wmnet
* 08:50 kharlan@deploy1003: kharlan: Rolling back deployment
* 08:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93652 and previous config saved to /var/cache/conftool/dbconfig/20260603-084846-fceratto.json
* 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 08:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93651 and previous config saved to /var/cache/conftool/dbconfig/20260603-084819-fceratto.json
* 08:47 kharlan@deploy1003: kharlan: Backport for [[gerrit:1296635{{!}}Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2215.codfw.wmnet with OS trixie
* 08:45 jiji@cumin1003: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) check docker-registry: maintenance
* 08:45 jiji@cumin1003: START - Cookbook sre.discovery.service-route check docker-registry: maintenance
* 08:43 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1211: Migration of db1211.eqiad.wmnet completed
* 08:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296635{{!}}Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]]
* 08:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1211.eqiad.wmnet with OS trixie
* 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93649 and previous config saved to /var/cache/conftool/dbconfig/20260603-083811-fceratto.json
* 08:37 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296632{{!}}Image Browsing: add accessible labels to carousel elements (T407793)]] (duration: 32m 11s)
* 08:36 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2054: repool after upgrade
* 08:35 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool es2054.codfw.wmnet: After reimage
* 08:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2054.codfw.wmnet: After reimage
* 08:35 jiji@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:34 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 08:34 jiji@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:33 jiji@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:33 jiji@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:31 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:31 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2054.codfw.wmnet with OS trixie
* 08:30 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:29 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2215.codfw.wmnet with reason: host reimage
* 08:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93647 and previous config saved to /var/cache/conftool/dbconfig/20260603-082804-fceratto.json
* 08:25 mszwarc@deploy1003: mlitn, mszwarc: Continuing with deployment
* 08:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1049: repool after upgrade
* 08:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2215.codfw.wmnet with reason: host reimage
* 08:22 mszwarc@deploy1003: mlitn, mszwarc: Backport for [[gerrit:1296632{{!}}Image Browsing: add accessible labels to carousel elements (T407793)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
* 08:18 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 08:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93645 and previous config saved to /var/cache/conftool/dbconfig/20260603-081756-fceratto.json
* 08:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 08:17 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 08:16 jiji@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2054.codfw.wmnet with reason: host reimage
* 08:08 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2054.codfw.wmnet with reason: host reimage
* 08:05 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1296632{{!}}Image Browsing: add accessible labels to carousel elements (T407793)]]
* 08:04 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296580{{!}}Add kha to wmgExtraLanguageNames (T427917)]], [[gerrit:1296703{{!}}jawiki: lift IP caps for workshop (T427912)]], [[gerrit:1296713{{!}}conductwiki: add sitename and logo (T426984 T427541)]], [[gerrit:1296627{{!}}Add missing lazy img to carousel (T427821)]], [[gerrit:1295968{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T426799)]]
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93643 and previous config saved to /var/cache/conftool/dbconfig/20260603-080346-fceratto.json
* 08:03 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1211.eqiad.wmnet with OS trixie
* 08:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 08:03 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2215.codfw.wmnet with OS trixie
* 08:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1211: Upgrading db1211.eqiad.wmnet
* 08:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2215: Upgrading db2215.codfw.wmnet
* 08:01 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:01 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1211: Upgrading db1211.eqiad.wmnet
* 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2215: Upgrading db2215.codfw.wmnet
* 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:01 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1157: Repooling
* 08:01 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1157: Repooling
* 08:00 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 07:57 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1022-1023].eqiad.wmnet with reason: Reimaging upstream server
* 07:57 mszwarc@deploy1003: anzx, mlitn, mfossati, mszwarc: Continuing with deployment
* 07:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Reimaging upstream server
* 07:54 mszwarc@deploy1003: anzx, mlitn, mfossati, mszwarc: Backport for [[gerrit:1296580{{!}}Add kha to wmgExtraLanguageNames (T427917)]], [[gerrit:1296703{{!}}jawiki: lift IP caps for workshop (T427912)]], [[gerrit:1296713{{!}}conductwiki: add sitename and logo (T426984 T427541)]], [[gerrit:1296627{{!}}Add missing lazy img to carousel (T427821)]], [[gerrit:1295968{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T42…]]
* 07:52 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2231: repool after maintenance
* 07:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2054.codfw.wmnet with OS trixie
* 07:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2054: Upgrading es2054.codfw.wmnet
* 07:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2054: Upgrading es2054.codfw.wmnet
* 07:50 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1296580{{!}}Add kha to wmgExtraLanguageNames (T427917)]], [[gerrit:1296703{{!}}jawiki: lift IP caps for workshop (T427912)]], [[gerrit:1296713{{!}}conductwiki: add sitename and logo (T426984 T427541)]], [[gerrit:1296627{{!}}Add missing lazy img to carousel (T427821)]], [[gerrit:1295968{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T426799)]]
* 07:48 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296516{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]], [[gerrit:1296517{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]] (duration: 32m 13s)
* 07:44 marostegui@dns1004: END - running authdns-update
* 07:43 marostegui@dns1004: START - running authdns-update
* 07:42 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1056 to es2 eqiad primary [[phab:T427875|T427875]]', diff saved to https://phabricator.wikimedia.org/P93637 and previous config saved to /var/cache/conftool/dbconfig/20260603-074250-marostegui.json
* 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1049: repool after upgrade
* 07:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:35 mszwarc@deploy1003: mszwarc, stran: Continuing with deployment
* 07:35 mszwarc@deploy1003: mszwarc, stran: Backport for [[gerrit:1296516{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]], [[gerrit:1296517{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1049.eqiad.wmnet with OS trixie
* 07:16 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1296516{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]], [[gerrit:1296517{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]]
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1049.eqiad.wmnet with reason: host reimage
* 07:07 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1049.eqiad.wmnet with reason: host reimage
* 07:07 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2231: repool after maintenance
* 07:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2231.codfw.wmnet with OS trixie
* 06:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1049.eqiad.wmnet with OS trixie
* 06:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1049: Upgrading es1049.eqiad.wmnet
* 06:46 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2056 to es2 codfw primary [[phab:T427875|T427875]]', diff saved to https://phabricator.wikimedia.org/P93632 and previous config saved to /var/cache/conftool/dbconfig/20260603-064623-marostegui.json
* 06:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1049: Upgrading es1049.eqiad.wmnet
* 06:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1056: repool after upgrade
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2231.codfw.wmnet with reason: host reimage
* 06:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2231.codfw.wmnet with reason: host reimage
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2231.codfw.wmnet with OS trixie
* 06:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2231: Upgrading db2231.codfw.wmnet
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2231: Upgrading db2231.codfw.wmnet
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:59 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1056: repool after upgrade
* 05:59 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 05:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1056.eqiad.wmnet with OS trixie
* 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1056.eqiad.wmnet with reason: host reimage
* 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1056.eqiad.wmnet with reason: host reimage
* 05:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1056.eqiad.wmnet with OS trixie
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1056: Upgrading es1056.eqiad.wmnet
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1056: Upgrading es1056.eqiad.wmnet
* 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
== 2026-06-02 ==
* 22:21 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296689{{!}}hCaptcha: Correct inaccurate comment]] (duration: 06m 27s)
* 22:18 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:18 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:17 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 22:17 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296689{{!}}hCaptcha: Correct inaccurate comment]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:15 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296689{{!}}hCaptcha: Correct inaccurate comment]]
* 22:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296551{{!}}hCaptcha: Enable for badlogin on group0 wikis (T426875)]] (duration: 08m 31s)
* 22:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:09 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 22:07 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296551{{!}}hCaptcha: Enable for badlogin on group0 wikis (T426875)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:05 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296551{{!}}hCaptcha: Enable for badlogin on group0 wikis (T426875)]]
* 20:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93621 and previous config saved to /var/cache/conftool/dbconfig/20260602-203945-fceratto.json
* 20:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93620 and previous config saved to /var/cache/conftool/dbconfig/20260602-202937-fceratto.json
* 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1054.eqiad.wmnet
* 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1054.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:26 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1054.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:20 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 20:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93619 and previous config saved to /var/cache/conftool/dbconfig/20260602-201929-fceratto.json
* 20:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93618 and previous config saved to /var/cache/conftool/dbconfig/20260602-200922-fceratto.json
* 20:03 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1054.eqiad.wmnet
* 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1053.eqiad.wmnet
* 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1053.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:37 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1053.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93617 and previous config saved to /var/cache/conftool/dbconfig/20260602-190907-fceratto.json
* 19:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 19:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93616 and previous config saved to /var/cache/conftool/dbconfig/20260602-190811-fceratto.json
* 19:05 dancy@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.5 refs [[phab:T423914|T423914]]
* 18:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P93615 and previous config saved to /var/cache/conftool/dbconfig/20260602-185804-fceratto.json
* 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P93614 and previous config saved to /var/cache/conftool/dbconfig/20260602-184757-fceratto.json
* 18:38 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:38 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:38 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93612 and previous config saved to /var/cache/conftool/dbconfig/20260602-183749-fceratto.json
* 18:37 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:37 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:33 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1053.eqiad.wmnet
* 18:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93611 and previous config saved to /var/cache/conftool/dbconfig/20260602-183023-fceratto.json
* 18:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93610 and previous config saved to /var/cache/conftool/dbconfig/20260602-182956-fceratto.json
* 18:27 mutante: gerrit delete unused plugin projects: barricade, WikimediaBlocks and WikimediaWebSessions
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1052.eqiad.wmnet
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1052.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1052.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:25 dancy: Train is blocked at testwikis on https://phabricator.wikimedia.org/T427935
* 18:21 Daimona: Running query from [[phab:T427962|T427962]]#11978299 in x1.wikishared
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P93609 and previous config saved to /var/cache/conftool/dbconfig/20260602-181949-fceratto.json
* 18:16 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296615{{!}}feat(cleanMentorList): Add a feature flag (T427386)]], [[gerrit:1296614{{!}}feat(cleanMentorList): Add a feature flag (T427386)]] (duration: 34m 09s)
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:12 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:12 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:12 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P93608 and previous config saved to /var/cache/conftool/dbconfig/20260602-180941-fceratto.json
* 18:08 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 18:07 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 18:06 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 18:06 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 18:04 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 18:02 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 18:02 swfrench-wmf: reverting shellbox to 2026-05-20-192555 due to errors in shellbox-syntaxhighlight
* 18:02 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 18:01 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 18:01 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 18:01 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1296615{{!}}feat(cleanMentorList): Add a feature flag (T427386)]], [[gerrit:1296614{{!}}feat(cleanMentorList): Add a feature flag (T427386)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:00 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1052.eqiad.wmnet
* 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93607 and previous config saved to /var/cache/conftool/dbconfig/20260602-175933-fceratto.json
* 17:58 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:57 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1051.eqiad.wmnet
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1051.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:55 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1051.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:53 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93605 and previous config saved to /var/cache/conftool/dbconfig/20260602-175227-fceratto.json
* 17:52 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93604 and previous config saved to /var/cache/conftool/dbconfig/20260602-175157-fceratto.json
* 17:51 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:51 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:50 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:50 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:50 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:49 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:49 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:48 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:48 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:47 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:44 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:42 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:42 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:42 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P93603 and previous config saved to /var/cache/conftool/dbconfig/20260602-174150-fceratto.json
* 17:41 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1296615{{!}}feat(cleanMentorList): Add a feature flag (T427386)]], [[gerrit:1296614{{!}}feat(cleanMentorList): Add a feature flag (T427386)]]
* 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P93602 and previous config saved to /var/cache/conftool/dbconfig/20260602-173143-fceratto.json
* 17:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93601 and previous config saved to /var/cache/conftool/dbconfig/20260602-172135-fceratto.json
* 17:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93600 and previous config saved to /var/cache/conftool/dbconfig/20260602-171422-fceratto.json
* 17:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93599 and previous config saved to /var/cache/conftool/dbconfig/20260602-171354-fceratto.json
* 17:04 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P93598 and previous config saved to /var/cache/conftool/dbconfig/20260602-170344-fceratto.json
* 16:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P93597 and previous config saved to /var/cache/conftool/dbconfig/20260602-165336-fceratto.json
* 16:49 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1051.eqiad.wmnet
* 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1050.eqiad.wmnet
* 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1050.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:47 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1050.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93596 and previous config saved to /var/cache/conftool/dbconfig/20260602-164328-fceratto.json
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93595 and previous config saved to /var/cache/conftool/dbconfig/20260602-163622-fceratto.json
* 16:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 16:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93594 and previous config saved to /var/cache/conftool/dbconfig/20260602-163550-fceratto.json
* 16:34 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:34 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS trixie
* 16:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:27 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2006.codfw.wmnet with OS trixie
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P93593 and previous config saved to /var/cache/conftool/dbconfig/20260602-162542-fceratto.json
* 16:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P93591 and previous config saved to /var/cache/conftool/dbconfig/20260602-161534-fceratto.json
* 16:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 16:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS trixie
* 16:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296624{{!}}Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] (duration: 06m 40s)
* 16:09 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
* 16:05 kharlan@deploy1003: kharlan: Continuing with deployment
* 16:05 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 16:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93590 and previous config saved to /var/cache/conftool/dbconfig/20260602-160527-fceratto.json
* 16:05 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
* 16:05 kharlan@deploy1003: kharlan: Backport for [[gerrit:1296624{{!}}Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:03 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296624{{!}}Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]]
* 15:59 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295909{{!}}hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)]] (duration: 09m 48s)
* 15:59 kharlan@deploy1003: kharlan: Rolling back deployment
* 15:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93589 and previous config saved to /var/cache/conftool/dbconfig/20260602-155817-fceratto.json
* 15:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 15:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93588 and previous config saved to /var/cache/conftool/dbconfig/20260602-155749-fceratto.json
* 15:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 15:53 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS trixie
* 15:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS trixie
* 15:51 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295909{{!}}hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:50 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 15:49 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295909{{!}}hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)]]
* 15:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P93587 and previous config saved to /var/cache/conftool/dbconfig/20260602-154742-fceratto.json
* 15:47 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296558{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]], [[gerrit:1296568{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]] (duration: 07m 24s)
* 15:43 kharlan@deploy1003: kharlan: Continuing with deployment
* 15:42 kharlan@deploy1003: kharlan: Backport for [[gerrit:1296558{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]], [[gerrit:1296568{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:40 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296558{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]], [[gerrit:1296568{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]]
* 15:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P93586 and previous config saved to /var/cache/conftool/dbconfig/20260602-153734-fceratto.json
* 15:37 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS trixie
* 15:36 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS trixie
* 15:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 15:32 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:32 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:31 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 15:30 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:29 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93585 and previous config saved to /var/cache/conftool/dbconfig/20260602-152726-fceratto.json
* 15:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2158: Repooling
* {{safesubst:SAL entry|1=15:22 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295502{{!}}Revert "labswiki: Disallow account autocreation"]], [[gerrit:1283106{{!}}Remove unused 'writeapi' right]], [[gerrit:1296566{{!}}Clean up bot password configuration]], [[gerrit:1296563{{!}}Remove workaround for stuck session cookies on Wikitech (T389433)]], [[gerrit:1295574{{!}}cswiki: lift IP cap for workshop on 08-June-2026 (T427678)]], [[gerrit:1296582{{!}}U}}
* 15:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 15:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93583 and previous config saved to /var/cache/conftool/dbconfig/20260602-152026-fceratto.json
* 15:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93582 and previous config saved to /var/cache/conftool/dbconfig/20260602-151958-fceratto.json
* 15:19 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:19 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:18 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS trixie
* 15:18 dreamyjazz@deploy1003: matmarex, anzx, dreamyjazz: Continuing with deployment
* 15:18 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:15 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* {{safesubst:SAL entry|1=15:15 dreamyjazz@deploy1003: matmarex, anzx, dreamyjazz: Backport for [[gerrit:1295502{{!}}Revert "labswiki: Disallow account autocreation"]], [[gerrit:1283106{{!}}Remove unused 'writeapi' right]], [[gerrit:1296566{{!}}Clean up bot password configuration]], [[gerrit:1296563{{!}}Remove workaround for stuck session cookies on Wikitech (T389433)]], [[gerrit:1295574{{!}}cswiki: lift IP cap for workshop on 08-June-2026 (T427678)]], [[gerrit:1296582}}
* 15:14 jiji@cumin1003: START - Cookbook sre.dns.netbox
* {{safesubst:SAL entry|1=15:13 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1295502{{!}}Revert "labswiki: Disallow account autocreation"]], [[gerrit:1283106{{!}}Remove unused 'writeapi' right]], [[gerrit:1296566{{!}}Clean up bot password configuration]], [[gerrit:1296563{{!}}Remove workaround for stuck session cookies on Wikitech (T389433)]], [[gerrit:1295574{{!}}cswiki: lift IP cap for workshop on 08-June-2026 (T427678)]], [[gerrit:1296582{{!}}Us}}
* 15:12 jayme@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2006.codfw.wmnet with OS trixie
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS trixie
* 15:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P93580 and previous config saved to /var/cache/conftool/dbconfig/20260602-150951-fceratto.json
* 15:09 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296514{{!}}[Growth] Set wgGEMentorshipCleanupEnabled to false on all wikis (T427386)]] (duration: 06m 22s)
* 15:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Repooling after Icinga wait-for-green timeout
* 15:06 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1050.eqiad.wmnet
* 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1049.eqiad.wmnet
* 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1049.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:05 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1049.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:02 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1296514{{!}}[Growth] Set wgGEMentorshipCleanupEnabled to false on all wikis (T427386)]]
* 15:02 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS trixie
* 15:01 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P93578 and previous config saved to /var/cache/conftool/dbconfig/20260602-145943-fceratto.json
* 14:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 14:52 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:52 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:52 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1049.eqiad.wmnet
* 14:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS trixie
* 14:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:50 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93575 and previous config saved to /var/cache/conftool/dbconfig/20260602-144935-fceratto.json
* 14:42 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for pc2021.codfw.wmnet
* 14:42 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for pc2021.codfw.wmnet
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2250.codfw.wmnet
* 14:41 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2250.codfw.wmnet
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2158.codfw.wmnet
* 14:41 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2158.codfw.wmnet
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Repooling
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Repooling
* 14:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93573 and previous config saved to /var/cache/conftool/dbconfig/20260602-144110-fceratto.json
* 14:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2158: Repooling
* 14:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93571 and previous config saved to /var/cache/conftool/dbconfig/20260602-144043-fceratto.json
* 14:38 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:38 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:38 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1048.eqiad.wmnet
* 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1048.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 14:37 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS trixie
* 14:36 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS trixie
* 14:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P93569 and previous config saved to /var/cache/conftool/dbconfig/20260602-143035-fceratto.json
* 14:30 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 14:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1048.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 14:21 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Repooling after Icinga wait-for-green timeout
* 14:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P93566 and previous config saved to /var/cache/conftool/dbconfig/20260602-142027-fceratto.json
* 14:17 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS trixie
* 14:17 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 14:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1167.eqiad.wmnet
* 14:17 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1167.eqiad.wmnet
* 14:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS trixie
* 14:15 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS trixie
* 14:14 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93564 and previous config saved to /var/cache/conftool/dbconfig/20260602-141019-fceratto.json
* 14:09 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments userOptions.php --delete --nowarn growthexperiments-homepage-variant # [[phab:T417621|T417621]]
* 14:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1048.eqiad.wmnet
* 14:08 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments userOptions.php --delete growthexperiments-homepage-variant # [[phab:T417621|T417621]]
* 14:05 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93563 and previous config saved to /var/cache/conftool/dbconfig/20260602-140140-fceratto.json
* 14:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 14:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 14:01 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS trixie
* 14:00 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 14:00 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93562 and previous config saved to /var/cache/conftool/dbconfig/20260602-140022-fceratto.json
* 14:00 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS trixie
* 13:56 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1167.eqiad.wmnet with OS trixie
* 13:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P93561 and previous config saved to /var/cache/conftool/dbconfig/20260602-135015-fceratto.json
* 13:47 topranks: revert all config to normal on cr1-codfw and ssw1-a1-codfw
* 13:43 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS trixie
* 13:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 13:40 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS trixie
* 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P93560 and previous config saved to /var/cache/conftool/dbconfig/20260602-134007-fceratto.json
* 13:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
* 13:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1002.eqiad.wmnet with OS trixie
* 13:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1003.eqiad.wmnet with OS trixie
* 13:34 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:34 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:32 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 13:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93559 and previous config saved to /var/cache/conftool/dbconfig/20260602-132959-fceratto.json
* 13:27 slyngshede@dns1004: END - running authdns-update
* 13:25 slyngshede@dns1004: START - running authdns-update
* 13:24 topranks: increase OSPF cost on ssw1-a1-codfw et-0/0/4 towards lsw1-a5-codfw [[phab:T427301|T427301]]
* 13:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 13:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93558 and previous config saved to /var/cache/conftool/dbconfig/20260602-132314-fceratto.json
* 13:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
* 13:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93557 and previous config saved to /var/cache/conftool/dbconfig/20260602-132246-fceratto.json
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS trixie
* 13:19 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 13:19 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS trixie
* 13:18 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 13:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2049: repool after upgrade
* 13:17 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:16 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1167.eqiad.wmnet with OS trixie
* 13:15 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1167: Upgrading db1167.eqiad.wmnet
* 13:13 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1167: Upgrading db1167.eqiad.wmnet
* 13:13 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:12 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 13:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P93554 and previous config saved to /var/cache/conftool/dbconfig/20260602-131238-fceratto.json
* 13:12 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 13:12 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 13:11 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 13:07 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1003.eqiad.wmnet with OS trixie
* 13:07 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS trixie
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS trixie
* 13:04 jayme@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2006.codfw.wmnet with OS trixie
* 13:04 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:03 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1022-1023].eqiad.wmnet with reason: Reimaging upstream servers
* 13:03 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1001.eqiad.wmnet with OS trixie
* 13:03 topranks: increase OSPF cost on ssw1-a1-codfw et-0/0/2 towards lsw1-a3-codfw [[phab:T427301|T427301]]
* 13:03 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 13:02 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Reimaging upstream servers
* 13:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P93553 and previous config saved to /var/cache/conftool/dbconfig/20260602-130230-fceratto.json
* 12:59 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2161: Migration of db2161.codfw.wmnet completed
* 12:54 topranks: shutdown sub-interfaces on cr1-codfw et-1/1/5 for row A/B vlans [[phab:T427301|T427301]]
* 12:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93550 and previous config saved to /var/cache/conftool/dbconfig/20260602-125223-fceratto.json
* 12:50 topranks: enable bgp graceful-shutdown in overlay on ssw1-a1-codfw [[phab:T427301|T427301]]
* 12:49 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1061.eqiad.wmnet with OS trixie
* 12:48 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt
* 12:48 ayounsi@cumin1003: START - Cookbook sre.hosts.remove-downtime for lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt
* 12:47 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS trixie
* 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93548 and previous config saved to /var/cache/conftool/dbconfig/20260602-124541-fceratto.json
* 12:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1207.eqiad.wmnet with reason: Maintenance
* 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93547 and previous config saved to /var/cache/conftool/dbconfig/20260602-124512-fceratto.json
* 12:43 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1060.eqiad.wmnet with OS trixie
* 12:42 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:42 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 12:42 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 12:41 topranks: enable bgp graceful-shutdown in underlay on ssw1-a1-codfw [[phab:T427301|T427301]]
* 12:35 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 12:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P93545 and previous config saved to /var/cache/conftool/dbconfig/20260602-123505-fceratto.json
* 12:33 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 12:33 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 12:31 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2049: repool after upgrade
* 12:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:29 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS trixie
* 12:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2049.codfw.wmnet with OS trixie
* 12:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P93542 and previous config saved to /var/cache/conftool/dbconfig/20260602-122459-fceratto.json
* 12:24 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS trixie
* 12:21 XioNoX: reboot lsw1-a3-codfw for software upgrade - [[phab:T427301|T427301]]
* 12:20 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS trixie
* 12:20 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 12:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS trixie
* 12:17 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 12:16 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296532{{!}}hCaptcha: Deduplicate edit API detection code (T427887)]], [[gerrit:1296533{{!}}hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)]] (duration: 09m 02s)
* 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93539 and previous config saved to /var/cache/conftool/dbconfig/20260602-121451-fceratto.json
* 12:11 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2049.codfw.wmnet with reason: host reimage
* 12:11 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt with reason: Switch maintenance
* 12:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2161: Migration of db2161.codfw.wmnet completed
* 12:09 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Switch maintenance
* 12:09 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296532{{!}}hCaptcha: Deduplicate edit API detection code (T427887)]], [[gerrit:1296533{{!}}hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1200 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93537 and previous config saved to /var/cache/conftool/dbconfig/20260602-120755-fceratto.json
* 12:07 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
* 12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93536 and previous config saved to /var/cache/conftool/dbconfig/20260602-120728-fceratto.json
* 12:07 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 12:07 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296532{{!}}hCaptcha: Deduplicate edit API detection code (T427887)]], [[gerrit:1296533{{!}}hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)]]
* 12:05 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2049.codfw.wmnet with reason: host reimage
* 12:04 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 12:02 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 12:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2161.codfw.wmnet with OS trixie
* 12:00 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 11:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P93535 and previous config saved to /var/cache/conftool/dbconfig/20260602-115721-fceratto.json
* 11:55 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 11:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:55 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 11:53 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 11:53 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 11:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:50 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS trixie
* 11:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS trixie
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2049.codfw.wmnet with OS trixie
* 11:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2049: Upgrading es2049.codfw.wmnet
* 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2049: Upgrading es2049.codfw.wmnet
* 11:47 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:47 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS trixie
* 11:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2056: repool after upgrade
* 11:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P93532 and previous config saved to /var/cache/conftool/dbconfig/20260602-114713-fceratto.json
* 11:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS trixie
* 11:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2161.codfw.wmnet with reason: host reimage
* 11:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2161.codfw.wmnet with reason: host reimage
* 11:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93531 and previous config saved to /var/cache/conftool/dbconfig/20260602-113705-fceratto.json
* 11:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1185 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93529 and previous config saved to /var/cache/conftool/dbconfig/20260602-113019-fceratto.json
* 11:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
* 11:29 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 11:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1161: Repooling
* 11:26 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1161: Repooling
* 11:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS trixie
* 11:22 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 11:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2161: Upgrading db2161.codfw.wmnet
* 11:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2161: Upgrading db2161.codfw.wmnet
* 11:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 11:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P93527 and previous config saved to /var/cache/conftool/dbconfig/20260602-111954-fceratto.json
* 11:15 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2161 [[phab:T427892|T427892]]', diff saved to https://phabricator.wikimedia.org/P93525 and previous config saved to /var/cache/conftool/dbconfig/20260602-111511-cwilliams.json
* 11:12 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2165 to s8 primary [[phab:T427892|T427892]]', diff saved to https://phabricator.wikimedia.org/P93524 and previous config saved to /var/cache/conftool/dbconfig/20260602-111200-cwilliams.json
* 11:10 cezmunsta: Starting s8 codfw failover from db2161 to db2165 - [[phab:T427892|T427892]]
* 11:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P93523 and previous config saved to /var/cache/conftool/dbconfig/20260602-110947-fceratto.json
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS trixie
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS trixie
* 11:04 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2165 with weight 0 [[phab:T427892|T427892]]', diff saved to https://phabricator.wikimedia.org/P93522 and previous config saved to /var/cache/conftool/dbconfig/20260602-110420-cwilliams.json
* 11:03 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 [[phab:T427892|T427892]]
* 11:02 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2056: repool after upgrade
* 11:01 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93520 and previous config saved to /var/cache/conftool/dbconfig/20260602-105939-fceratto.json
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93519 and previous config saved to /var/cache/conftool/dbconfig/20260602-105239-fceratto.json
* 10:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 10:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93518 and previous config saved to /var/cache/conftool/dbconfig/20260602-105202-fceratto.json
* 10:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2056.codfw.wmnet with OS trixie
* 10:42 moritzm: installing busybox security updates
* 10:42 claime: Enabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P93517 and previous config saved to /var/cache/conftool/dbconfig/20260602-104154-fceratto.json
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P93516 and previous config saved to /var/cache/conftool/dbconfig/20260602-103146-fceratto.json
* 10:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2056.codfw.wmnet with reason: host reimage
* 10:27 claime: Disabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 10:25 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2056.codfw.wmnet with reason: host reimage
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93515 and previous config saved to /var/cache/conftool/dbconfig/20260602-102139-fceratto.json
* 10:09 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2056.codfw.wmnet with OS trixie
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2056: Upgrading es2056.codfw.wmnet
* 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2056: Upgrading es2056.codfw.wmnet
* 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 09:56 claime: Enabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 09:46 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on cumin2003.codfw.wmnet with reason: in setup
* 09:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1187: Pooling
* 09:37 claime: Running puppet on cp6010 and cp6011 - [[phab:T422937|T422937]]
* 09:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow2004.codfw.wmnet to plain
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93511 and previous config saved to /var/cache/conftool/dbconfig/20260602-093716-fceratto.json
* 09:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1159.eqiad.wmnet with reason: Maintenance
* 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow2004.codfw.wmnet to plain
* 09:34 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of rpki2003.codfw.wmnet to plain
* 09:34 claime: Disabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 09:34 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of rpki2003.codfw.wmnet to plain
* 09:32 moritzm: temporarily remove ganeti2045 from the codfw cluster [[phab:T427357|T427357]]
* 09:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1055.eqiad.wmnet with OS trixie
* 09:15 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1187: Pooling
* 09:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1187 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93508 and previous config saved to /var/cache/conftool/dbconfig/20260602-091126-fceratto.json
* 09:09 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1187 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93506 and previous config saved to /var/cache/conftool/dbconfig/20260602-090432-fceratto.json
* 09:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
* 08:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2250.codfw.wmnet with reason: rack A3 maintenance
* 08:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS trixie
* 08:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:53 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:52 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:51 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:50 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 08:41 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:37 urbanecm: Reset user email of Barras@votewiki to the one of Barras@SUL
* 08:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 08:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93505 and previous config saved to /var/cache/conftool/dbconfig/20260602-083033-fceratto.json
* 08:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:29 slyngs: IDP, new configuration in preparation for webauthn
* 08:20 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P93504 and previous config saved to /var/cache/conftool/dbconfig/20260602-082026-fceratto.json
* 08:19 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:18 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:18 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:17 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296488{{!}}Revert "translate: adding separate read/write endpoints" (T425377)]] (duration: 03m 33s)
* 08:16 atsuko@deploy1003: atsuko: Rolling back deployment
* 08:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2053: repool after upgrade
* 08:15 atsuko@deploy1003: atsuko: Backport for [[gerrit:1296488{{!}}Revert "translate: adding separate read/write endpoints" (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:13 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1296488{{!}}Revert "translate: adding separate read/write endpoints" (T425377)]]
* 08:11 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:10 marostegui: Install mariadb 10.11.17 on es2053 [[phab:T427345|T427345]]
* 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P93502 and previous config saved to /var/cache/conftool/dbconfig/20260602-081018-fceratto.json
* 08:09 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Depool for rack maintenance
* 08:03 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296262{{!}}translate: fixing missed variable in credentials formatting closure (T425377)]] (duration: 14m 47s)
* 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93499 and previous config saved to /var/cache/conftool/dbconfig/20260602-080011-fceratto.json
* 07:59 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 07:59 atsuko@deploy1003: atsuko: Rolling back deployment
* 07:58 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 07:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1181 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93498 and previous config saved to /var/cache/conftool/dbconfig/20260602-075759-fceratto.json
* 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 07:57 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
* 07:50 atsuko@deploy1003: atsuko: Backport for [[gerrit:1296262{{!}}translate: fixing missed variable in credentials formatting closure (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:49 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1296262{{!}}translate: fixing missed variable in credentials formatting closure (T425377)]]
* 07:48 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1181: Pooling
* 07:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1181: Pooling
* 07:44 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1181: Reboot
* 07:43 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1181: Reboot
* 07:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1181.eqiad.wmnet with reason: Reboot
* 07:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1181: Migration of db1181.eqiad.wmnet completed
* 07:40 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294949{{!}}translate: adding separate read/write endpoints (T425377)]] (duration: 21m 01s)
* 07:39 atsuko@deploy1003: atsuko: Rolling back deployment
* 07:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93490 and previous config saved to /var/cache/conftool/dbconfig/20260602-073904-fceratto.json
* 07:32 XioNoX: pfw1-eqiad# delete protocols bgp group Production family inet6 - [[phab:T423384|T423384]]
* 07:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2053: repool after upgrade
* 07:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2158.codfw.wmnet with reason: rack A3 maintenance
* 07:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93487 and previous config saved to /var/cache/conftool/dbconfig/20260602-072856-fceratto.json
* 07:28 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2158: rack A3 maintenance
* 07:28 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2158: rack A3 maintenance
* 07:27 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on pc2021.codfw.wmnet with reason: rack A3 maintenance
* 07:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: rack A3 maintenance
* 07:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 07:25 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 07:25 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: rack A3 maintenance
* 07:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Depool for rack maintenance
* 07:23 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2241.codfw.wmnet
* 07:23 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2241.codfw.wmnet
* 07:21 atsuko@deploy1003: atsuko: Backport for [[gerrit:1294949{{!}}translate: adding separate read/write endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2053.codfw.wmnet with OS trixie
* 07:19 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1294949{{!}}translate: adding separate read/write endpoints (T425377)]]
* 07:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2241.codfw.wmnet with reason: Depool for rack maintenance
* 07:14 marostegui: Install mariadb 10.11.17 on db2186 [[phab:T427345|T427345]]
* 07:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Depool for rack maintenance
* 07:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2186.codfw.wmnet with reason: upgrade
* 07:12 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Depool for rack maintenance
* 07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2053.codfw.wmnet with reason: host reimage
* 06:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2053.codfw.wmnet with reason: host reimage
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93478 and previous config saved to /var/cache/conftool/dbconfig/20260602-065533-fceratto.json
* 06:55 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1181: Migration of db1181.eqiad.wmnet completed
* 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 06:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1181.eqiad.wmnet with OS trixie
* 06:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2053.codfw.wmnet with OS trixie
* 06:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2053: Upgrading es2053.codfw.wmnet
* 06:41 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2053: Upgrading es2053.codfw.wmnet
* 06:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 06:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 06:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 06:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1052: repool after upgrade
* 06:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
* 06:24 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
* 06:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 06:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 06:16 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 06:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 06:08 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1181.eqiad.wmnet with OS trixie
* 06:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1181: Upgrading db1181.eqiad.wmnet
* 06:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1181: Upgrading db1181.eqiad.wmnet
* 06:04 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:02 marostegui@dns1004: END - running authdns-update
* 06:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1181 [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93473 and previous config saved to /var/cache/conftool/dbconfig/20260602-060157-marostegui.json
* 06:01 marostegui@dns1004: START - running authdns-update
* 06:00 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1236 to s7 primary and set section read-write [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93472 and previous config saved to /var/cache/conftool/dbconfig/20260602-060041-marostegui.json
* 06:00 marostegui@cumin1003: dbctl commit (dc=all): 'Set s7 eqiad as read-only for maintenance - [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93471 and previous config saved to /var/cache/conftool/dbconfig/20260602-060018-marostegui.json
* 06:00 marostegui: Starting s7 eqiad failover from db1181 to db1236 - [[phab:T426088|T426088]]
* 05:51 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1236 with weight 0 [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93470 and previous config saved to /var/cache/conftool/dbconfig/20260602-055153-marostegui.json
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s7 [[phab:T426088|T426088]]
* 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1052: repool after upgrade
* 05:50 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 05:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1052.eqiad.wmnet with OS trixie
* 05:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:29 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1052.eqiad.wmnet with reason: host reimage
* 05:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1052.eqiad.wmnet with reason: host reimage
* 05:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:07 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1052.eqiad.wmnet with OS trixie
* 05:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1052: Upgrading es1052.eqiad.wmnet
* 05:06 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1052: Upgrading es1052.eqiad.wmnet
* 05:05 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 04:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 04:49 ryankemper: [[phab:T425007|T425007]] (k8s) created 4 wdqs namespaces on `dse-k8s-codfw`'s `admin_ng` ns: `wdqs-[internal,external]` & `wdqs-[internal,external]-next`; certs issued
* 04:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 04:40 ryankemper@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 04:36 ryankemper@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 04:05 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.2 (duration: 05m 33s)
== 2026-06-01 ==
* 23:27 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295963{{!}}Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542)]], [[gerrit:1295962{{!}}Carousel: Defer to MobileFrontend lightbox on mobile (T427679)]] (duration: 07m 17s)
* 23:23 jdlrobson@deploy1003: mfossati, jdlrobson: Continuing with deployment
* 23:22 jdlrobson@deploy1003: mfossati, jdlrobson: Backport for [[gerrit:1295963{{!}}Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542)]], [[gerrit:1295962{{!}}Carousel: Defer to MobileFrontend lightbox on mobile (T427679)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:20 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1295963{{!}}Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542)]], [[gerrit:1295962{{!}}Carousel: Defer to MobileFrontend lightbox on mobile (T427679)]]
* 23:15 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296022{{!}}Donor Delight Badge: Add dependency on mw.user (T427850)]], [[gerrit:1296028{{!}}styles: Limit selector to badge client pref (T427407)]] (duration: 09m 33s)
* 23:11 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:07 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1296022{{!}}Donor Delight Badge: Add dependency on mw.user (T427850)]], [[gerrit:1296028{{!}}styles: Limit selector to badge client pref (T427407)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:06 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1296022{{!}}Donor Delight Badge: Add dependency on mw.user (T427850)]], [[gerrit:1296028{{!}}styles: Limit selector to badge client pref (T427407)]]
* 23:04 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6015.*
* 22:36 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296024{{!}}Add maintenance script to scrape SVG render files]] (duration: 06m 22s)
* 22:32 reedy@deploy1003: reedy: Continuing with deployment
* 22:31 reedy@deploy1003: reedy: Backport for [[gerrit:1296024{{!}}Add maintenance script to scrape SVG render files]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:30 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1296024{{!}}Add maintenance script to scrape SVG render files]]
* 22:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 22:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 22:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 21:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 21:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 21:51 sbassett: Deployed updated mitigation for [[phab:T326691|T326691]]
* 21:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 21:35 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 21:35 maryum: Deployed security fix for [[phab:T427611|T427611]]
* 21:35 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 21:33 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 21:32 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 21:27 maryum: Deployed security fix for [[phab:T427235|T427235]]
* 21:13 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296002{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565)]], [[gerrit:1296003{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T427565)]], [[gerrit:1296009{{!}}Redirect Special:AccountRecovery to the shared domain (T427692)]] (duration: 09m 20s)
* 21:09 catrope@deploy1003: catrope, arlolra: Continuing with deployment
* 21:09 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 21:09 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 21:08 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 21:07 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 21:07 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 21:06 catrope@deploy1003: catrope, arlolra: Backport for [[gerrit:1296002{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565)]], [[gerrit:1296003{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T427565)]], [[gerrit:1296009{{!}}Redirect Special:AccountRecovery to the shared domain (T427692)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:04 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1296002{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565)]], [[gerrit:1296003{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T427565)]], [[gerrit:1296009{{!}}Redirect Special:AccountRecovery to the shared domain (T427692)]]
* 20:53 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 20:37 ryankemper@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wdqs1015.eqiad.wmnet with reason: [[phab:T427852|T427852]] hw failure
* 20:26 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285412{{!}}Remove `wgTestKitchenExperimentStreamNames` (T422358)]], [[gerrit:1295531{{!}}Enable AbuseFilter block action on nlwiki (T427384)]] (duration: 07m 48s)
* 20:22 catrope@deploy1003: sfaci, xxblackburnxx, catrope: Continuing with deployment
* 20:20 catrope@deploy1003: sfaci, xxblackburnxx, catrope: Backport for [[gerrit:1285412{{!}}Remove `wgTestKitchenExperimentStreamNames` (T422358)]], [[gerrit:1295531{{!}}Enable AbuseFilter block action on nlwiki (T427384)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:18 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1285412{{!}}Remove `wgTestKitchenExperimentStreamNames` (T422358)]], [[gerrit:1295531{{!}}Enable AbuseFilter block action on nlwiki (T427384)]]
* 20:12 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295504{{!}}passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)]] (duration: 07m 37s)
* 20:08 catrope@deploy1003: catrope: Continuing with deployment
* 20:07 catrope@deploy1003: catrope: Backport for [[gerrit:1295504{{!}}passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:05 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1295504{{!}}passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)]]
* 19:48 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 19:47 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 19:47 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 19:46 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 19:46 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 19:45 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 19:01 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: sync
* 19:00 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: sync
* 18:24 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295950{{!}}mediawiki.user_change.dev0 - key by user.wiki_id (T426198)]] (duration: 06m 42s)
* 18:20 otto@deploy1003: otto: Continuing with deployment
* 18:19 otto@deploy1003: otto: Backport for [[gerrit:1295950{{!}}mediawiki.user_change.dev0 - key by user.wiki_id (T426198)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:17 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1295950{{!}}mediawiki.user_change.dev0 - key by user.wiki_id (T426198)]]
* 18:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 18:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 18:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd2001.codfw.wmnet to plain
* 18:02 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 18:02 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd2001.codfw.wmnet to plain
* 18:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2003.codfw.wmnet to plain
* 18:01 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 18:01 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2003.codfw.wmnet to plain
* 17:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 17:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 17:53 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS trixie
* 17:42 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295976{{!}}nlwiki: change to Wikipedia 25 logo (T424519)]] (duration: 07m 29s)
* 17:37 samtar@deploy1003: chlod, samtar: Continuing with deployment
* 17:36 samtar@deploy1003: chlod, samtar: Backport for [[gerrit:1295976{{!}}nlwiki: change to Wikipedia 25 logo (T424519)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:34 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1295976{{!}}nlwiki: change to Wikipedia 25 logo (T424519)]]
* 17:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1236: Update
* 17:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd2001.codfw.wmnet to drbd
* 17:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
* 17:04 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 17:04 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1180: Pooling
* 17:03 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 17:03 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
* 17:03 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 16:59 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd2001.codfw.wmnet to drbd
* 16:58 Amir1: drop flaggedrevs tables on wikinews wikis ([[phab:T423577|T423577]])
* 16:57 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 16:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93462 and previous config saved to /var/cache/conftool/dbconfig/20260601-165717-fceratto.json
* 16:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93460 and previous config saved to /var/cache/conftool/dbconfig/20260601-164709-fceratto.json
* 16:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
* 16:37 ryankemper@cumin2002: conftool action : set/pooled=no; selector: dc=eqiad,cluster=wdqs-main,service=wdqs-main,name=wdqs1015.eqiad.wmnet
* 16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93458 and previous config saved to /var/cache/conftool/dbconfig/20260601-163701-fceratto.json
* 16:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:35 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1236.eqiad.wmnet
* 16:35 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1236.eqiad.wmnet
* 16:35 ryankemper@cumin2002: conftool action : set/pooled=no; selector: dc=eqiad,cluster=wdqs,service=wdqs-main,name=wdqs1015.eqiad.wmnet
* 16:34 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
* 16:34 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1236: Update
* 16:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1236.eqiad.wmnet with reason: Kernel update [[phab:T426633|T426633]]
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:30 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1236.eqiad.wmnet
* 16:30 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1236.eqiad.wmnet
* 16:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
* 16:29 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1236: Update
* 16:29 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
* 16:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2003.codfw.wmnet to drbd
* 16:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93455 and previous config saved to /var/cache/conftool/dbconfig/20260601-162653-fceratto.json
* 16:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 16:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Migration of db1209.eqiad.wmnet completed
* 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1236.eqiad.wmnet with reason: Kernel update [[phab:T426633|T426633]]
* 16:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1236: Update
* 16:09 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1236: Update
* 16:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:06 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 16:05 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2003.codfw.wmnet to drbd
* 16:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 16:03 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 16:02 moritzm: temporarily remove ganeti2027 from the codfw cluster [[phab:T427357|T427357]]
* 15:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:56 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.depool (exit_code=97) depool db1224: Pooling
* 15:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2005.codfw.wmnet with OS bullseye
* 15:53 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1224: Pooling
* 15:51 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 15:49 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
* 15:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
* 15:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
* 15:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2005.codfw.wmnet with reason: host reimage
* 15:40 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:40 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
* 15:40 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
* 15:40 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
* 15:40 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
* 15:40 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
* 15:39 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:39 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 15:39 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Migration of db1209.eqiad.wmnet completed
* 15:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 15:38 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:38 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
* 15:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2005.codfw.wmnet with reason: host reimage
* 15:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 15:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1209.eqiad.wmnet with OS trixie
* 15:28 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295802{{!}}hCaptcha: Raise SiteVerify error threshold to 100]] (duration: 06m 15s)
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93446 and previous config saved to /var/cache/conftool/dbconfig/20260601-152638-fceratto.json
* 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 15:26 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:25 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
* 15:25 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
* 15:25 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
* 15:25 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:24 kharlan@deploy1003: kharlan: Continuing with deployment
* 15:24 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295802{{!}}hCaptcha: Raise SiteVerify error threshold to 100]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:22 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host testvm2005.codfw.wmnet with OS bullseye
* 15:22 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295802{{!}}hCaptcha: Raise SiteVerify error threshold to 100]]
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:20 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295946{{!}}hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)]] (duration: 08m 24s)
* 15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:16 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
* 15:14 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1295946{{!}}hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1295946{{!}}hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)]]
* 15:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
* 15:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93445 and previous config saved to /var/cache/conftool/dbconfig/20260601-151024-fceratto.json
* 15:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:sessionstore
* 15:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93443 and previous config saved to /var/cache/conftool/dbconfig/20260601-150017-fceratto.json
* 14:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1209.eqiad.wmnet with OS trixie
* 14:52 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 14:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1209: Upgrading db1209.eqiad.wmnet
* 14:52 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 14:52 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1209: Upgrading db1209.eqiad.wmnet
* 14:52 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 14:51 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:51 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 14:50 atsuko@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 14:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93441 and previous config saved to /var/cache/conftool/dbconfig/20260601-145010-fceratto.json
* 14:49 atsuko@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 14:49 atsuko@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 14:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:42 atsuko@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 14:41 atsuko@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93440 and previous config saved to /var/cache/conftool/dbconfig/20260601-144002-fceratto.json
* 14:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:30 ladsgroup@deploy1003: Synchronized portals: Deploy portals ([[phab:T421797|T421797]]) (duration: 02m 43s)
* 14:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:27 ladsgroup@deploy1003: Synchronized portals/wikipedia.org/assets: Deploy portals ([[phab:T421797|T421797]]) (duration: 06m 10s)
* 14:25 sukhe@dns1004: END - running authdns-update
* 14:23 sukhe@dns1004: START - running authdns-update
* 14:22 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:11 Lucas_WMDE: UTC afternoon backport+config window done
* 14:10 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295918{{!}}Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)]] (duration: 11m 06s)
* 14:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:05 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Continuing with deployment
* 14:03 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Backport for [[gerrit:1295918{{!}}Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:01 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:sessionstore
* 13:58 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1295918{{!}}Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)]]
* 13:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1265.eqiad.wmnet with OS trixie
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93439 and previous config saved to /var/cache/conftool/dbconfig/20260601-133947-fceratto.json
* 13:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 13:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 13:35 atsukoito: restarted pybal.service on lvs2013
* 13:31 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 13:31 atsukoito: restarted pybal.service on lvs2014
* 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs-test2001.codfw.wmnet
* 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet
* 13:22 atsukoito: restarted pybal.service on lvs1019
* 13:22 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:21 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:20 atsukoito: restarted pybal.service on lvs1020
* 13:20 Msz2001: UTC afternoon backport+config window done
* 13:20 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295875{{!}}Add SetGlobalPreference maintenance script (T427476)]] (duration: 06m 22s)
* 13:19 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs-test2001.codfw.wmnet
* 13:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1265.eqiad.wmnet with OS trixie
* 13:18 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs-test1001.eqiad.wmnet
* 13:16 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 13:15 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1295875{{!}}Add SetGlobalPreference maintenance script (T427476)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 atsukoito: sudo cumin 'A:lvs-low-traffic-eqiad' 'systemctl restart pybal.service'
* 13:14 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1295875{{!}}Add SetGlobalPreference maintenance script (T427476)]]
* 13:12 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295536{{!}}swwiki: Enable the Visual Editor on the project namespace (T427117)]] (duration: 10m 06s)
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93438 and previous config saved to /var/cache/conftool/dbconfig/20260601-130949-fceratto.json
* 13:08 mszwarc@deploy1003: codenamenoreste, mszwarc: Continuing with deployment
* 13:07 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:06 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:05 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:04 mszwarc@deploy1003: codenamenoreste, mszwarc: Backport for [[gerrit:1295536{{!}}swwiki: Enable the Visual Editor on the project namespace (T427117)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:03 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:02 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1295536{{!}}swwiki: Enable the Visual Editor on the project namespace (T427117)]]
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93437 and previous config saved to /var/cache/conftool/dbconfig/20260601-125941-fceratto.json
* 12:56 dpogorzelski@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=inference,name=eqiad
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'readability' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 12:52 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:50 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93436 and previous config saved to /var/cache/conftool/dbconfig/20260601-124934-fceratto.json
* 12:48 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:41 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93435 and previous config saved to /var/cache/conftool/dbconfig/20260601-123926-fceratto.json
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:29 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:28 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:27 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to plain
* 12:26 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to plain
* 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
* 12:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to drbd
* 12:20 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:17 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:15 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:15 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:11 dpogorzelski@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=inference,name=eqiad
* 12:07 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to drbd
* 12:05 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
* 12:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:04 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2027.codfw.wmnet
* 12:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 11:59 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
* 11:59 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
* 11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93434 and previous config saved to /var/cache/conftool/dbconfig/20260601-113911-fceratto.json
* 11:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 11:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93433 and previous config saved to /var/cache/conftool/dbconfig/20260601-113843-fceratto.json
* 11:37 moritzm: installing Exim security updates
* 11:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93432 and previous config saved to /var/cache/conftool/dbconfig/20260601-112835-fceratto.json
* 11:25 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 11:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:22 moritzm: installing imagemagick security updates
* 11:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:22 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 11:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93430 and previous config saved to /var/cache/conftool/dbconfig/20260601-111827-fceratto.json
* 11:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:14 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 11:12 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 11:10 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93429 and previous config saved to /var/cache/conftool/dbconfig/20260601-110820-fceratto.json
* 11:04 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 11:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1055: repool after upgrade
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93427 and previous config saved to /var/cache/conftool/dbconfig/20260601-110121-fceratto.json
* 11:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 10:54 marostegui@dns1004: END - running authdns-update
* 10:52 marostegui@dns1004: START - running authdns-update
* 10:48 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1050 to es1 eqiad primary [[phab:T427032|T427032]]', diff saved to https://phabricator.wikimedia.org/P93425 and previous config saved to /var/cache/conftool/dbconfig/20260601-104837-marostegui.json
* 10:47 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2055 to es1 codfw primary [[phab:T427032|T427032]]', diff saved to https://phabricator.wikimedia.org/P93424 and previous config saved to /var/cache/conftool/dbconfig/20260601-104739-marostegui.json
* 10:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1177: Migration of db1177.eqiad.wmnet completed
* 10:40 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2003.codfw.wmnet
* 10:34 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2003.codfw.wmnet
* 10:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93421 and previous config saved to /var/cache/conftool/dbconfig/20260601-103316-fceratto.json
* 10:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93418 and previous config saved to /var/cache/conftool/dbconfig/20260601-102308-fceratto.json
* 10:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1055: repool after upgrade
* 10:15 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1055.eqiad.wmnet with OS trixie
* 10:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93415 and previous config saved to /var/cache/conftool/dbconfig/20260601-101300-fceratto.json
* 10:09 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:07 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93414 and previous config saved to /var/cache/conftool/dbconfig/20260601-100252-fceratto.json
* 10:00 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1177: Migration of db1177.eqiad.wmnet completed
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1055.eqiad.wmnet with reason: host reimage
* 09:56 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 09:54 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 09:53 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1055.eqiad.wmnet with reason: host reimage
* 09:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1177.eqiad.wmnet with OS trixie
* 09:51 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:50 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:39 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1055.eqiad.wmnet with OS trixie
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1055: Upgrading es1055.eqiad.wmnet
* 09:38 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1055: Upgrading es1055.eqiad.wmnet
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
* 09:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
* 09:17 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1177.eqiad.wmnet with OS trixie
* 09:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:14 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:13 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:12 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1177: Upgrading db1177.eqiad.wmnet
* 09:11 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1177: Upgrading db1177.eqiad.wmnet
* 09:11 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93410 and previous config saved to /var/cache/conftool/dbconfig/20260601-090237-fceratto.json
* 09:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93409 and previous config saved to /var/cache/conftool/dbconfig/20260601-090209-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P93408 and previous config saved to /var/cache/conftool/dbconfig/20260601-085202-fceratto.json
* 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P93407 and previous config saved to /var/cache/conftool/dbconfig/20260601-084154-fceratto.json
* 08:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93406 and previous config saved to /var/cache/conftool/dbconfig/20260601-083146-fceratto.json
* 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93405 and previous config saved to /var/cache/conftool/dbconfig/20260601-082442-fceratto.json
* 08:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 07:58 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295454{{!}}Disable the creation of synthetic main refs in production (T427484)]] (duration: 11m 26s)
* 07:56 XioNoX: add no_p2p term to pfw1-codfw BGP_fundraising_export - [[phab:T423384|T423384]]
* 07:52 wmde-fisch@deploy1003: lilients, wmde-fisch: Continuing with deployment
* 07:51 wmde-fisch@deploy1003: lilients, wmde-fisch: Backport for [[gerrit:1295454{{!}}Disable the creation of synthetic main refs in production (T427484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:47 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1295454{{!}}Disable the creation of synthetic main refs in production (T427484)]]
* 07:45 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294826{{!}}Update VE core submodule to master (9cf5524e7) (T424232)]] (duration: 31m 34s)
* 07:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 07:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 07:32 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
* 07:31 wmde-fisch@deploy1003: wmde-fisch: Backport for [[gerrit:1294826{{!}}Update VE core submodule to master (9cf5524e7) (T424232)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 07:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1294826{{!}}Update VE core submodule to master (9cf5524e7) (T424232)]]
* 06:48 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 06:47 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
== 2026-05-31 ==
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 30s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-30 ==
* 16:21 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:38 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 27s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-29 ==
* 23:39 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 23:37 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 21:42 catrope@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 21:41 catrope@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:40 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] (duration: 06m 54s)
* 17:35 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 17:34 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:33 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]]
* 16:30 jgreen@dns1004: END - running authdns-update
* 16:28 jgreen@dns1004: START - running authdns-update
* 16:13 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:12 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:28 dancy@deploy1003: Installation of scap version "4.267.0" completed for 2 hosts
* 15:26 dancy@deploy1003: Installing scap version "4.267.0" for 2 host(s)
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:15 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] (duration: 07m 58s)
* 14:11 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:09 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]]
* 13:53 moritzm: imported OpenJDK 21 21.0.11+10-1~deb12u1 to component/jdk21 (backport of latest Java 21 security release for Bookworm)
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1006.wikimedia.org
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1006.wikimedia.org with OS trixie
* 11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1006.wikimedia.org with OS trixie
* 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:15 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:13 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:00 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:00 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1006.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1005.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1005.wikimedia.org with OS trixie
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:40 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Pooling
* 10:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1005.wikimedia.org with OS trixie
* 10:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:55 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 09:50 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 09:49 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:44 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2014.codfw.wmnet with OS bookworm
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:20 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:12 jynus@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:10 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:03 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad2002.codfw.wmnet
* 08:59 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad2002.codfw.wmnet
* 08:59 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad2002.codfw.wmnet - [[phab:T427588|T427588]]
* 08:54 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:51 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad1004.eqiad.wmnet
* 08:50 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
* 08:50 jynus@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host backup2014.codfw.wmnet with OS bookworm
* 08:49 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
* 08:47 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad1004.eqiad.wmnet
* 08:46 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad1004.eqiad.wmnet - [[phab:T427588|T427588]]
* 08:42 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:39 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:39 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:38 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 08:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 08:33 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:31 jynus@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup2014.codfw.wmnet with OS bookworm
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:18 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 08:17 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 08:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1005.wikimedia.org
* 08:05 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 07:54 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 07:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2212.codfw.wmnet
* 07:54 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2212.codfw.wmnet
* 07:22 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2006.wikimedia.org
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2006.wikimedia.org with OS trixie
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2006.wikimedia.org with OS trixie
* 06:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2006.wikimedia.org
* 03:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts db1224.eqiad.wmnet
* 02:56 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 01:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5032.eqsin.wmnet with OS trixie
* 01:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 01:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 00:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 00:29 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host cp5032.eqsin.wmnet
* 00:23 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:22 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
== 2026-05-28 ==
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:07 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:34 andrewbogott: reprepro includedeb trixie-wikimedia /home/andrew/magnum-cluster-api_0.36.6-1~wmf13u2_amd64.deb
* 22:31 logmsgbot: dreamyjazz Deployed security patch for [[phab:T426388|T426388]]
* 21:33 maryum: Deployed security fix for [[phab:T426867|T426867]]
* 21:21 alexsanford: Deployed security fix for [[phab:T426889|T426889]]
* 21:07 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host cp5032.eqsin.wmnet
* 21:04 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 21:04 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 20:48 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] (duration: 07m 34s)
* 20:44 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:43 arlolra@deploy1003: arlolra: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:41 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]]
* 20:34 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] (duration: 07m 20s)
* 20:30 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:29 arlolra@deploy1003: arlolra: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:27 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]]
* 20:22 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] (duration: 09m 07s)
* 20:18 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Continuing with deployment
* 20:14 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] synced to the testservers (see https://wikitech.
* 20:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5032.eqsin.wmnet with OS trixie
* 20:13 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]]
* 19:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1018.eqiad.wmnet
* 19:27 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1018.eqiad.wmnet
* 19:09 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: Kernel reboot
* 19:09 brett: Stopping pybal/puppet/downtiming lvs1018.eqiad.wmnet for reboot
* 19:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1019.eqiad.wmnet
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1019.eqiad.wmnet
* 18:52 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:51 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:47 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:40 mutante: planet1003/planet2003 - apt-get upgrade - all pending package upgrades
* 18:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: Kernel reboot
* 18:34 brett: Stopping pybal/puppet/downtiming lvs1019.eqiad.wmnet for reboot and BIOS update/memory self-healing - [[phab:T426109|T426109]]
* 18:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2011.codfw.wmnet
* 18:25 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2011.codfw.wmnet
* 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Kernel reboot
* 18:19 brett: Stopping pybal/puppet/downtiming lvs2011.codfw.wmnet for reboot
* 18:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 18:06 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 18:00 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: Kernel reboot
* 17:57 brett: Stopping pybal/puppet/downtiming lvs2013.codfw.wmnet for reboot
* 17:19 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 16:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93393 and previous config saved to /var/cache/conftool/dbconfig/20260528-164514-fceratto.json
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93392 and previous config saved to /var/cache/conftool/dbconfig/20260528-163507-fceratto.json
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93391 and previous config saved to /var/cache/conftool/dbconfig/20260528-162459-fceratto.json
* 16:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db1224.eqiad.wmnet with reason: unreachable [[phab:T427535|T427535]]
* 16:17 swfrench-wmf: reprepro include xdebug_3.4.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include wikidiff2_1.14.1-2+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include php-yaml_2.2.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-xhprof_2.3.10-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-wmerrors_2.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-uuid_1.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-redis_6.2.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-pcov_1.0.12-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-memcached_3.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:15 swfrench-wmf: reprepro include php-luasandbox_4.1.2-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93390 and previous config saved to /var/cache/conftool/dbconfig/20260528-161452-fceratto.json
* 16:14 swfrench-wmf: reprepro include php-imagick_3.7.0-13+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:14 swfrench-wmf: reprepro include php-excimer_1.2.5-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1251 ([[phab:T426633|T426633]])', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260528-160646-fceratto.json
* 16:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1251.eqiad.wmnet with reason: Maintenance
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93388 and previous config saved to /var/cache/conftool/dbconfig/20260528-160613-fceratto.json
* 15:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93387 and previous config saved to /var/cache/conftool/dbconfig/20260528-155605-fceratto.json
* 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93386 and previous config saved to /var/cache/conftool/dbconfig/20260528-154557-fceratto.json
* 15:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93385 and previous config saved to /var/cache/conftool/dbconfig/20260528-153550-fceratto.json
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93384 and previous config saved to /var/cache/conftool/dbconfig/20260528-152736-fceratto.json
* 15:27 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: Maintenance
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93383 and previous config saved to /var/cache/conftool/dbconfig/20260528-152708-fceratto.json
* 15:20 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5032.eqsin.wmnet with reason: Testing reimaging on new subnet
* 15:18 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 15:17 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93382 and previous config saved to /var/cache/conftool/dbconfig/20260528-151701-fceratto.json
* 15:17 jhathaway: dmarc ingress test on mx-in1001
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93381 and previous config saved to /var/cache/conftool/dbconfig/20260528-150653-fceratto.json
* 14:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93380 and previous config saved to /var/cache/conftool/dbconfig/20260528-145646-fceratto.json
* 14:56 moritzm: installing nginx security updates
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93379 and previous config saved to /var/cache/conftool/dbconfig/20260528-144936-fceratto.json
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: Maintenance
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93378 and previous config saved to /var/cache/conftool/dbconfig/20260528-144909-fceratto.json
* 14:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2005.wikimedia.org
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2005.wikimedia.org with OS trixie
* 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:39 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93377 and previous config saved to /var/cache/conftool/dbconfig/20260528-143901-fceratto.json
* 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93376 and previous config saved to /var/cache/conftool/dbconfig/20260528-142854-fceratto.json
* 14:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] (duration: 11m 29s)
* 14:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93375 and previous config saved to /var/cache/conftool/dbconfig/20260528-141846-fceratto.json
* 14:15 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93374 and previous config saved to /var/cache/conftool/dbconfig/20260528-141029-fceratto.json
* 14:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1232.eqiad.wmnet with reason: Maintenance
* 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2005.wikimedia.org with OS trixie
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93373 and previous config saved to /var/cache/conftool/dbconfig/20260528-141001-fceratto.json
* 14:09 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]]
* 14:00 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93371 and previous config saved to /var/cache/conftool/dbconfig/20260528-135951-fceratto.json
* 13:58 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6015.drmrs.wmnet,service=(cdn{{!}}ats-be)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93370 and previous config saved to /var/cache/conftool/dbconfig/20260528-134944-fceratto.json
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93369 and previous config saved to /var/cache/conftool/dbconfig/20260528-133936-fceratto.json
* 13:39 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:38 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:36 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] (duration: 06m 40s)
* 13:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93368 and previous config saved to /var/cache/conftool/dbconfig/20260528-133230-fceratto.json
* 13:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93367 and previous config saved to /var/cache/conftool/dbconfig/20260528-133202-fceratto.json
* 13:31 mlitn@deploy1003: mlitn: Continuing with deployment
* 13:31 mlitn@deploy1003: mlitn: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:29 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]]
* 13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93366 and previous config saved to /var/cache/conftool/dbconfig/20260528-132155-fceratto.json
* 13:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:17 elukey: clean up a lot of stale Kafka ACLs on Kafka Jumbo - Details in [[phab:T425528|T425528]]
* 13:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2005.wikimedia.org
* 13:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93365 and previous config saved to /var/cache/conftool/dbconfig/20260528-131147-fceratto.json
* 13:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93364 and previous config saved to /var/cache/conftool/dbconfig/20260528-130139-fceratto.json
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93363 and previous config saved to /var/cache/conftool/dbconfig/20260528-125439-fceratto.json
* 12:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93362 and previous config saved to /var/cache/conftool/dbconfig/20260528-125412-fceratto.json
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93361 and previous config saved to /var/cache/conftool/dbconfig/20260528-124404-fceratto.json
* 12:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:43 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:39 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:38 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93360 and previous config saved to /var/cache/conftool/dbconfig/20260528-123357-fceratto.json
* 12:25 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93359 and previous config saved to /var/cache/conftool/dbconfig/20260528-122349-fceratto.json
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93358 and previous config saved to /var/cache/conftool/dbconfig/20260528-121551-fceratto.json
* 12:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance
* 12:15 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93357 and previous config saved to /var/cache/conftool/dbconfig/20260528-121523-fceratto.json
* 12:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93356 and previous config saved to /var/cache/conftool/dbconfig/20260528-120515-fceratto.json
* 12:02 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 12:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93355 and previous config saved to /var/cache/conftool/dbconfig/20260528-115508-fceratto.json
* 11:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93354 and previous config saved to /var/cache/conftool/dbconfig/20260528-114500-fceratto.json
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93353 and previous config saved to /var/cache/conftool/dbconfig/20260528-113635-fceratto.json
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93352 and previous config saved to /var/cache/conftool/dbconfig/20260528-113559-fceratto.json
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93351 and previous config saved to /var/cache/conftool/dbconfig/20260528-112551-fceratto.json
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93350 and previous config saved to /var/cache/conftool/dbconfig/20260528-111543-fceratto.json
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93349 and previous config saved to /var/cache/conftool/dbconfig/20260528-110536-fceratto.json
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93348 and previous config saved to /var/cache/conftool/dbconfig/20260528-105820-fceratto.json
* 10:58 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1195.eqiad.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93347 and previous config saved to /var/cache/conftool/dbconfig/20260528-105753-fceratto.json
* 10:56 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 10:50 moritzm: update trixie netboot image for 13.5 point release [[phab:T427072|T427072]]
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93346 and previous config saved to /var/cache/conftool/dbconfig/20260528-104745-fceratto.json
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93345 and previous config saved to /var/cache/conftool/dbconfig/20260528-103738-fceratto.json
* 10:29 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P13724 # [[phab:T406971|T406971]]
* 10:28 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P14223 # [[phab:T422264|T422264]]
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93344 and previous config saved to /var/cache/conftool/dbconfig/20260528-102730-fceratto.json
* 10:26 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P1748 # [[phab:T422392|T422392]]
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93343 and previous config saved to /var/cache/conftool/dbconfig/20260528-101900-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93342 and previous config saved to /var/cache/conftool/dbconfig/20260528-101829-fceratto.json
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93341 and previous config saved to /var/cache/conftool/dbconfig/20260528-100822-fceratto.json
* 09:59 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] (duration: 06m 41s)
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93340 and previous config saved to /var/cache/conftool/dbconfig/20260528-095814-fceratto.json
* 09:55 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 09:54 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:52 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]]
* 09:48 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] (duration: 07m 37s)
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93339 and previous config saved to /var/cache/conftool/dbconfig/20260528-094807-fceratto.json
* 09:44 dreamyjazz@deploy1003: dreamyjazz, stran: Continuing with deployment
* 09:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:42 dreamyjazz@deploy1003: dreamyjazz, stran: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]]
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93338 and previous config saved to /var/cache/conftool/dbconfig/20260528-093920-fceratto.json
* 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93337 and previous config saved to /var/cache/conftool/dbconfig/20260528-093849-fceratto.json
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93336 and previous config saved to /var/cache/conftool/dbconfig/20260528-092842-fceratto.json
* 09:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
* 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93335 and previous config saved to /var/cache/conftool/dbconfig/20260528-092239-fceratto.json
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki-root1001.eqiad.wmnet
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93334 and previous config saved to /var/cache/conftool/dbconfig/20260528-091834-fceratto.json
* 09:18 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1165: Reboot completed
* 09:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:17 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 09:14 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:13 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki-root1001.eqiad.wmnet
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93332 and previous config saved to /var/cache/conftool/dbconfig/20260528-091231-fceratto.json
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93331 and previous config saved to /var/cache/conftool/dbconfig/20260528-090826-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93329 and previous config saved to /var/cache/conftool/dbconfig/20260528-090224-fceratto.json
* 09:02 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod (duration: 02m 31s)
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93328 and previous config saved to /var/cache/conftool/dbconfig/20260528-090114-fceratto.json
* 09:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2216.codfw.wmnet with reason: Maintenance
* 09:00 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a] (duration: 02m 08s)
* 08:59 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod
* 08:58 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a]
* 08:57 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host (duration: 00m 53s)
* 08:56 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host
* 08:56 joal@deploy1003: Finished deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a] (duration: 06m 54s)
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93327 and previous config saved to /var/cache/conftool/dbconfig/20260528-085216-fceratto.json
* 08:50 XioNoX: cr1-codfw# delete protocols bgp group fundraising family inet6 - [[phab:T423384|T423384]]
* 08:49 joal@deploy1003: Started deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a]
* 08:49 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] (duration: 09m 20s)
* 08:49 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a] (duration: 02m 00s)
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93326 and previous config saved to /var/cache/conftool/dbconfig/20260528-084906-fceratto.json
* 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
* 08:48 slyngshede@dns1004: END - running authdns-update
* 08:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1165: Reboot completed
* 08:47 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a]
* 08:47 slyngs: Upgrade IDP to CAS 7.3.7.1
* 08:46 slyngshede@dns1004: START - running authdns-update
* 08:45 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93324 and previous config saved to /var/cache/conftool/dbconfig/20260528-084149-fceratto.json
* 08:41 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]]
* 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 08:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93323 and previous config saved to /var/cache/conftool/dbconfig/20260528-083504-fceratto.json
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93322 and previous config saved to /var/cache/conftool/dbconfig/20260528-083331-fceratto.json
* 08:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Test
* 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93320 and previous config saved to /var/cache/conftool/dbconfig/20260528-082324-fceratto.json
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2189: repool after crash
* 08:17 slyngshede@dns1004: END - running authdns-update
* 08:16 slyngshede@dns1004: START - running authdns-update
* 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93318 and previous config saved to /var/cache/conftool/dbconfig/20260528-081316-fceratto.json
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:09 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Test
* 08:05 hashar@deploy1003: Finished deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016 (duration: 00m 13s)
* 08:05 hashar@deploy1003: Started deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93315 and previous config saved to /var/cache/conftool/dbconfig/20260528-080309-fceratto.json
* 07:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93314 and previous config saved to /var/cache/conftool/dbconfig/20260528-075631-fceratto.json
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020,1022-1023].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
* 07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93313 and previous config saved to /var/cache/conftool/dbconfig/20260528-075521-fceratto.json
* 07:47 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93311 and previous config saved to /var/cache/conftool/dbconfig/20260528-074513-fceratto.json
* 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2189: repool after crash
* 07:36 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93309 and previous config saved to /var/cache/conftool/dbconfig/20260528-073506-fceratto.json
* 07:34 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:29 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] (duration: 06m 29s)
* 07:25 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Continuing with deployment
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93308 and previous config saved to /var/cache/conftool/dbconfig/20260528-072458-fceratto.json
* 07:24 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:24 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:23 tgr@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwikisource --logwiki=metawiki Ioed Renamed_user_4232d41570b9e8f46ef150e5e360e446 # [[phab:T427459|T427459]]
* 07:22 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]]
* 07:20 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] (duration: 06m 54s)
* 07:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93307 and previous config saved to /var/cache/conftool/dbconfig/20260528-071836-fceratto.json
* 07:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 07:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Reboot completed
* 07:16 wmde-fisch@deploy1003: wmde-fisch, robertsky: Continuing with deployment
* 07:15 wmde-fisch@deploy1003: wmde-fisch, robertsky: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]]
* 07:11 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] (duration: 07m 15s)
* 07:07 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Continuing with deployment
* 07:06 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:04 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]]
* 06:43 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Reboot completed
* 06:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93303 and previous config saved to /var/cache/conftool/dbconfig/20260528-064217-fceratto.json
* 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93302 and previous config saved to /var/cache/conftool/dbconfig/20260528-063357-fceratto.json
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 06:25 hashar: Restarting CI Jenkins for plugins upgrades
* 06:16 fceratto@dns1005: END - running authdns-update
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1209 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93301 and previous config saved to /var/cache/conftool/dbconfig/20260528-061609-fceratto.json
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1193 to s8 primary and set section read-write [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93300 and previous config saved to /var/cache/conftool/dbconfig/20260528-061138-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s8 eqiad as read-only for maintenance - [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93299 and previous config saved to /var/cache/conftool/dbconfig/20260528-061048-fceratto.json
* 06:10 federico3: Starting s8 eqiad failover from db1209 to db1193 - [[phab:T426095|T426095]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1193 with weight 0 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93298 and previous config saved to /var/cache/conftool/dbconfig/20260528-060412-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 [[phab:T426095|T426095]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:53 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:49 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:25 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] (duration: 07m 12s)
* 00:21 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]]
* 00:12 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] (duration: 07m 25s)
* 00:09 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:06 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:04 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]]
* 00:04 swfrench-wmf: reprepro include php-apcu_5.1.24-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
== 2026-05-27 ==
* 23:13 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] (duration: 08m 42s)
* 23:09 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Continuing with deployment
* 23:06 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:04 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]]
* 22:58 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] (duration: 07m 49s)
* 22:55 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.sanitarium_restart (exit_code=0)
* 22:54 catrope@deploy1003: catrope: Continuing with deployment
* 22:52 catrope@deploy1003: catrope: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:50 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]]
* 22:46 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] (duration: 06m 54s)
* 22:42 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:41 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:40 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitarium_restart (exit_code=99)
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]]
* 22:39 ladsgroup@deploy1003: Finished scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) (duration: 07m 16s)
* 22:35 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:34 ladsgroup@deploy1003: ladsgroup: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:33 ladsgroup@deploy1003: Started scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]])
* 22:13 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] (duration: 10m 00s)
* 22:09 egardner@deploy1003: egardner: Continuing with deployment
* 22:05 egardner@deploy1003: egardner: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:03 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]]
* 21:37 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15 days, 0:00:00 on relforge[1008-1010].eqiad.wmnet with reason: non-production environment
* 21:20 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 21:19 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 21:04 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] (duration: 07m 38s)
* 20:59 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Continuing with deployment
* 20:58 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:56 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]]
* 20:51 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 30s)
* 20:47 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:46 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:44 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]]
* 20:43 swfrench-wmf: reprepro include dh-php_5.5+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:39 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts lvs1016.eqiad.wmnet
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:38 swfrench-wmf: reprepro include php-defaults_94+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:37 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:31 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:27 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf12u2 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:25 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs1016.eqiad.wmnet
* 20:25 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] (duration: 08m 11s)
* 20:21 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1016.eqiad.wmnet with OS bullseye
* 20:21 sbisson@deploy1003: sbisson: Continuing with deployment
* 20:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1020.eqiad.wmnet
* 20:19 sbisson@deploy1003: sbisson: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]]
* 20:14 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12355
* 20:04 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 12355
* 19:51 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1016.eqiad.wmnet with OS bullseye
* 19:48 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5032.eqsin.wmnet
* 19:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 19:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 19:01 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f] (duration: 02m 08s)
* 18:59 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f]
* 18:58 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 05m 01s)
* 18:53 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:53 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] (duration: 07m 41s)
* 18:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5031.eqsin.wmnet
* 18:49 catrope@deploy1003: catrope: Continuing with deployment
* 18:47 catrope@deploy1003: catrope: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:45 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]]
* 18:40 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 01m 05s)
* 18:39 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:37 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f] (duration: 02m 04s)
* 18:35 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f]
* 18:29 swfrench@deploy1003: Finished scap sync-world: Helmfile-only deployment to clean up unused mesh listeners (duration: 06m 12s)
* 18:25 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:24 swfrench@deploy1003: swfrench: Helmfile-only deployment to clean up unused mesh listeners synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:23 swfrench@deploy1003: Started scap sync-world: Helmfile-only deployment to clean up unused mesh listeners
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93296 and previous config saved to /var/cache/conftool/dbconfig/20260527-181923-fceratto.json
* 18:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93295 and previous config saved to /var/cache/conftool/dbconfig/20260527-180915-fceratto.json
* 18:09 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 18:09 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] (duration: 10m 24s)
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1017.eqiad.wmnet
* 18:08 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1017.eqiad.wmnet
* 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5024.eqsin.wmnet
* 18:03 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:00 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:00 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:00 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93294 and previous config saved to /var/cache/conftool/dbconfig/20260527-175908-fceratto.json
* 17:58 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]]
* 17:55 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93293 and previous config saved to /var/cache/conftool/dbconfig/20260527-174900-fceratto.json
* 17:43 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] (duration: 15m 01s)
* 17:38 swfrench@deploy1003: swfrench: Continuing with deployment
* 17:31 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:28 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]]
* 17:25 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1114.eqiad.wmnet
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:05 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] (duration: 08m 44s)
* 17:00 swfrench@deploy1003: swfrench: Continuing with deployment
* 16:58 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:56 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]]
* 16:53 atsuko@dns1004: END - running authdns-update
* 16:51 atsuko@dns1004: START - running authdns-update
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93292 and previous config saved to /var/cache/conftool/dbconfig/20260527-164846-fceratto.json
* 16:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93291 and previous config saved to /var/cache/conftool/dbconfig/20260527-164815-fceratto.json
* 16:43 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1112.eqiad.wmnet
* 16:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: Setting up
* 16:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93290 and previous config saved to /var/cache/conftool/dbconfig/20260527-163808-fceratto.json
* 16:37 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Repooling after testing patch
* 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93287 and previous config saved to /var/cache/conftool/dbconfig/20260527-162800-fceratto.json
* 16:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93285 and previous config saved to /var/cache/conftool/dbconfig/20260527-161753-fceratto.json
* 16:14 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 16:12 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 16:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93284 and previous config saved to /var/cache/conftool/dbconfig/20260527-161101-fceratto.json
* 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
* 16:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93283 and previous config saved to /var/cache/conftool/dbconfig/20260527-161034-fceratto.json
* 16:10 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 16:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1178: Recovering from failure in cookbook
* 16:10 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 16:05 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host durum5003.eqsin.wmnet with OS trixie
* 16:03 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6016.drmrs.wmnet
* 16:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93280 and previous config saved to /var/cache/conftool/dbconfig/20260527-160027-fceratto.json
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1017.eqiad.wmnet
* 15:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2163.codfw.wmnet
* 15:53 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2163.codfw.wmnet
* 15:52 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1017.eqiad.wmnet
* 15:52 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Repooling after testing patch
* 15:52 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 15:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Testing cookbook
* 15:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Testing cookbook
* 15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93276 and previous config saved to /var/cache/conftool/dbconfig/20260527-155019-fceratto.json
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93274 and previous config saved to /var/cache/conftool/dbconfig/20260527-154011-fceratto.json
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1178: Recovering from failure in cookbook
* 15:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1178.eqiad.wmnet
* 15:22 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1178.eqiad.wmnet
* 15:19 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 15:19 cdanis: 💙cdanis@cp4047.ulsfo.wmnet ~ 🕦☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:13 cdanis: 💙cdanis@cp5026.eqsin.wmnet ~ 🕚☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Icinga wait failed during run
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕚☕ sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93268 and previous config saved to /var/cache/conftool/dbconfig/20260527-150508-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1220.eqiad.wmnet with reason: Maintenance
* 15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93267 and previous config saved to /var/cache/conftool/dbconfig/20260527-150438-fceratto.json
* 14:59 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 14:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93264 and previous config saved to /var/cache/conftool/dbconfig/20260527-145430-fceratto.json
* 14:54 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2163.codfw.wmnet with OS trixie
* 14:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:46 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] (duration: 08m 32s)
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1178.eqiad.wmnet with OS trixie
* 14:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93263 and previous config saved to /var/cache/conftool/dbconfig/20260527-144423-fceratto.json
* 14:42 aude@deploy1003: aude: Continuing with deployment
* 14:40 aude@deploy1003: aude: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2189.codfw.wmnet with reason: crashed [[phab:T427376|T427376]]
* 14:38 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]]
* 14:35 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] (duration: 11m 30s)
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93262 and previous config saved to /var/cache/conftool/dbconfig/20260527-143416-fceratto.json
* 14:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 aude@deploy1003: aude: Continuing with deployment
* 14:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:27 aude@deploy1003: aude: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93260 and previous config saved to /var/cache/conftool/dbconfig/20260527-142659-fceratto.json
* 14:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:23 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]]
* 14:22 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1033.eqiad.wmnet with reason: Maintenance
* 14:18 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] (duration: 33m 01s)
* 14:10 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2163.codfw.wmnet with OS trixie
* 14:09 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS trixie
* 14:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1178: Upgrading db1178.eqiad.wmnet
* 14:07 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1178: Upgrading db1178.eqiad.wmnet
* 14:06 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 stran@deploy1003: stran: Continuing with deployment
* 14:02 stran@deploy1003: stran: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:56 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2164: Migration of db2164.codfw.wmnet completed
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:45 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]]
* 13:40 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] (duration: 11m 35s)
* 13:36 phuedx@deploy1003: phuedx: Continuing with deployment
* 13:30 phuedx@deploy1003: phuedx: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]]
* 13:21 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] (duration: 13m 23s)
* 13:15 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2189: Test
* 13:15 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 13:15 mlitn@deploy1003: krinkle, mlitn: Continuing with deployment
* 13:13 mlitn@deploy1003: krinkle, mlitn: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2164: Migration of db2164.codfw.wmnet completed
* 13:08 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]]
* 13:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2212.codfw.wmnet with reason: failed to reboot [[phab:T427388|T427388]] [[phab:T426633|T426633]]
* 13:05 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2164.codfw.wmnet with OS trixie
* 12:57 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1192.eqiad.wmnet with OS trixie
* 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:35 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:28 Amir1: deleting binlogs older than a year
* 12:22 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2164.codfw.wmnet with OS trixie
* 12:21 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 36692
* 12:21 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1192.eqiad.wmnet with OS trixie
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1077
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
* 12:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2164: Upgrading db2164.codfw.wmnet
* 12:20 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 36692
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1079
* 12:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2164: Upgrading db2164.codfw.wmnet
* 12:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1192: Upgrading db1192.eqiad.wmnet
* 12:19 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1192: Upgrading db1192.eqiad.wmnet
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:15 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Migration of db2165.codfw.wmnet completed
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:12 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2189: Test
* 12:11 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1193: Migration of db1193.eqiad.wmnet completed
* 12:09 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93243 and previous config saved to /var/cache/conftool/dbconfig/20260527-120452-fceratto.json
* 12:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93242 and previous config saved to /var/cache/conftool/dbconfig/20260527-120205-fceratto.json
* 12:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 11:58 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:58 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:56 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93239 and previous config saved to /var/cache/conftool/dbconfig/20260527-115157-fceratto.json
* 11:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93237 and previous config saved to /var/cache/conftool/dbconfig/20260527-114149-fceratto.json
* 11:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93235 and previous config saved to /var/cache/conftool/dbconfig/20260527-113142-fceratto.json
* 11:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Migration of db2165.codfw.wmnet completed
* 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1193: Migration of db1193.eqiad.wmnet completed
* 11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93231 and previous config saved to /var/cache/conftool/dbconfig/20260527-112327-fceratto.json
* 11:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2188.codfw.wmnet with reason: Maintenance
* 11:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93230 and previous config saved to /var/cache/conftool/dbconfig/20260527-112257-fceratto.json
* 11:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2165.codfw.wmnet with OS trixie
* 11:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1193.eqiad.wmnet with OS trixie
* 11:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93229 and previous config saved to /var/cache/conftool/dbconfig/20260527-111250-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:10 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93227 and previous config saved to /var/cache/conftool/dbconfig/20260527-110242-fceratto.json
* 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2189', diff saved to https://phabricator.wikimedia.org/P93226 and previous config saved to /var/cache/conftool/dbconfig/20260527-110016-marostegui.json
* 10:58 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:57 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 10:56 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93225 and previous config saved to /var/cache/conftool/dbconfig/20260527-105235-fceratto.json
* 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93223 and previous config saved to /var/cache/conftool/dbconfig/20260527-104518-fceratto.json
* 10:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93222 and previous config saved to /var/cache/conftool/dbconfig/20260527-104449-fceratto.json
* 10:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2165.codfw.wmnet with OS trixie
* 10:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1193.eqiad.wmnet with OS trixie
* 10:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2165: Upgrading db2165.codfw.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2165: Upgrading db2165.codfw.wmnet
* 10:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93218 and previous config saved to /var/cache/conftool/dbconfig/20260527-103441-fceratto.json
* 10:29 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:29 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93217 and previous config saved to /var/cache/conftool/dbconfig/20260527-102434-fceratto.json
* 10:22 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:21 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93215 and previous config saved to /var/cache/conftool/dbconfig/20260527-101426-fceratto.json
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1203: Migration of db1203.eqiad.wmnet completed
* 10:10 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2166: Migration of db2166.codfw.wmnet completed
* 10:08 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93212 and previous config saved to /var/cache/conftool/dbconfig/20260527-100701-fceratto.json
* 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 10:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93211 and previous config saved to /var/cache/conftool/dbconfig/20260527-100632-fceratto.json
* 10:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 10:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1050.eqiad.wmnet with OS trixie
* 09:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93208 and previous config saved to /var/cache/conftool/dbconfig/20260527-095624-fceratto.json
* 09:47 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93206 and previous config saved to /var/cache/conftool/dbconfig/20260527-094616-fceratto.json
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:43 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:38 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 09:38 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 09:37 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 09:37 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 09:36 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93203 and previous config saved to /var/cache/conftool/dbconfig/20260527-093609-fceratto.json
* 09:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93202 and previous config saved to /var/cache/conftool/dbconfig/20260527-092842-fceratto.json
* 09:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 09:28 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1203: Migration of db1203.eqiad.wmnet completed
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93200 and previous config saved to /var/cache/conftool/dbconfig/20260527-092814-fceratto.json
* 09:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1050.eqiad.wmnet with OS trixie
* 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 09:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2166: Migration of db2166.codfw.wmnet completed
* 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2051: repool after maintenance
* 09:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1203.eqiad.wmnet with OS trixie
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93196 and previous config saved to /var/cache/conftool/dbconfig/20260527-091806-fceratto.json
* 09:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2166.codfw.wmnet with OS trixie
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93194 and previous config saved to /var/cache/conftool/dbconfig/20260527-090759-fceratto.json
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3074.*
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3066.*
* 09:03 fabfur: repooling cp3074 and cp3066 ([[phab:T419825|T419825]])
* 09:02 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6015.drmrs.wmnet
* 09:02 slyngshede@cumin1003: START - Cookbook sre.hosts.remove-downtime for cp6015.drmrs.wmnet
* 09:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 09:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp6015.*
* 08:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93193 and previous config saved to /var/cache/conftool/dbconfig/20260527-085751-fceratto.json
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 08:54 Emperor: restart swift on ms-fe2011 [[phab:T360913|T360913]]
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3066.*
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3074.*
* 08:51 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:50 fabfur: depooling and installing haproxy-awslc on cp3074 and cp3066 ([[phab:T419825|T419825]])
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93191 and previous config saved to /var/cache/conftool/dbconfig/20260527-085024-fceratto.json
* 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93190 and previous config saved to /var/cache/conftool/dbconfig/20260527-085005-fceratto.json
* 08:41 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1203.eqiad.wmnet with OS trixie
* 08:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93189 and previous config saved to /var/cache/conftool/dbconfig/20260527-083957-fceratto.json
* 08:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2051: repool after maintenance
* 08:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 08:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:35 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2166.codfw.wmnet with OS trixie
* 08:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2051.codfw.wmnet with OS trixie
* 08:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93185 and previous config saved to /var/cache/conftool/dbconfig/20260527-082950-fceratto.json
* 08:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93184 and previous config saved to /var/cache/conftool/dbconfig/20260527-081942-fceratto.json
* 08:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:11 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93183 and previous config saved to /var/cache/conftool/dbconfig/20260527-081112-fceratto.json
* 08:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93182 and previous config saved to /var/cache/conftool/dbconfig/20260527-081054-fceratto.json
* 08:07 jmm@dns1004: END - running authdns-update
* 08:05 jmm@dns1004: START - running authdns-update
* 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93181 and previous config saved to /var/cache/conftool/dbconfig/20260527-080046-fceratto.json
* 07:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2051.codfw.wmnet with OS trixie
* 07:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93180 and previous config saved to /var/cache/conftool/dbconfig/20260527-075039-fceratto.json
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1026.eqiad.wmnet
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2051: Upgrading es2051.codfw.wmnet
* 07:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2051: Upgrading es2051.codfw.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93178 and previous config saved to /var/cache/conftool/dbconfig/20260527-074031-fceratto.json
* 07:40 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] (duration: 06m 42s)
* 07:36 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 07:35 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93177 and previous config saved to /var/cache/conftool/dbconfig/20260527-073504-fceratto.json
* 07:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2248.codfw.wmnet with reason: Maintenance
* 07:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93176 and previous config saved to /var/cache/conftool/dbconfig/20260527-073434-fceratto.json
* 07:33 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]]
* 07:28 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93175 and previous config saved to /var/cache/conftool/dbconfig/20260527-072426-fceratto.json
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.decommission (exit_code=0)
* 07:23 marostegui@cumin1003: Removing pc1014 from zarcillo [[phab:T427190|T427190]]
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1014.eqiad.wmnet
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:23 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:18 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1026.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1025.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93174 and previous config saved to /var/cache/conftool/dbconfig/20260527-071418-fceratto.json
* 07:13 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1014.eqiad.wmnet
* 07:13 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2003.wikimedia.org
* 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2055: repool after maintenance
* 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2003.wikimedia.org
* 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1003.wikimedia.org
* 07:06 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: Maintenance on db1190
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93172 and previous config saved to /var/cache/conftool/dbconfig/20260527-070410-fceratto.json
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1003.wikimedia.org
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93171 and previous config saved to /var/cache/conftool/dbconfig/20260527-065545-fceratto.json
* 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93170 and previous config saved to /var/cache/conftool/dbconfig/20260527-065526-fceratto.json
* 06:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1025.eqiad.wmnet
* 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93168 and previous config saved to /var/cache/conftool/dbconfig/20260527-064519-fceratto.json
* 06:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93166 and previous config saved to /var/cache/conftool/dbconfig/20260527-063511-fceratto.json
* 06:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93165 and previous config saved to /var/cache/conftool/dbconfig/20260527-062503-fceratto.json
* 06:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2055: repool after maintenance
* 06:21 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2055.codfw.wmnet with OS trixie
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93163 and previous config saved to /var/cache/conftool/dbconfig/20260527-061643-fceratto.json
* 06:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93162 and previous config saved to /var/cache/conftool/dbconfig/20260527-061613-fceratto.json
* 06:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93161 and previous config saved to /var/cache/conftool/dbconfig/20260527-060606-fceratto.json
* 06:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:56 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93160 and previous config saved to /var/cache/conftool/dbconfig/20260527-055558-fceratto.json
* 05:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93159 and previous config saved to /var/cache/conftool/dbconfig/20260527-054550-fceratto.json
* 05:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2055.codfw.wmnet with OS trixie
* 05:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:38 moritzm: remove ganeti1026 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93157 and previous config saved to /var/cache/conftool/dbconfig/20260527-053727-fceratto.json
* 05:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93156 and previous config saved to /var/cache/conftool/dbconfig/20260527-053708-fceratto.json
* 05:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93155 and previous config saved to /var/cache/conftool/dbconfig/20260527-052700-fceratto.json
* 05:26 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1014 from dbctl [[phab:T427270|T427270]]', diff saved to https://phabricator.wikimedia.org/P93154 and previous config saved to /var/cache/conftool/dbconfig/20260527-052624-marostegui.json
* 05:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93153 and previous config saved to /var/cache/conftool/dbconfig/20260527-051653-fceratto.json
* 05:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93152 and previous config saved to /var/cache/conftool/dbconfig/20260527-050645-fceratto.json
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93151 and previous config saved to /var/cache/conftool/dbconfig/20260527-045827-fceratto.json
* 04:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93150 and previous config saved to /var/cache/conftool/dbconfig/20260527-045759-fceratto.json
* 04:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93149 and previous config saved to /var/cache/conftool/dbconfig/20260527-044751-fceratto.json
* 04:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93148 and previous config saved to /var/cache/conftool/dbconfig/20260527-043744-fceratto.json
* 04:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93147 and previous config saved to /var/cache/conftool/dbconfig/20260527-042737-fceratto.json
* 04:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93146 and previous config saved to /var/cache/conftool/dbconfig/20260527-041921-fceratto.json
* 04:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 04:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93145 and previous config saved to /var/cache/conftool/dbconfig/20260527-041852-fceratto.json
* 04:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93144 and previous config saved to /var/cache/conftool/dbconfig/20260527-040844-fceratto.json
* 03:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93143 and previous config saved to /var/cache/conftool/dbconfig/20260527-035836-fceratto.json
* 03:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93142 and previous config saved to /var/cache/conftool/dbconfig/20260527-034828-fceratto.json
* 03:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93141 and previous config saved to /var/cache/conftool/dbconfig/20260527-034008-fceratto.json
* 03:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 03:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93140 and previous config saved to /var/cache/conftool/dbconfig/20260527-033938-fceratto.json
* 03:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93139 and previous config saved to /var/cache/conftool/dbconfig/20260527-032931-fceratto.json
* 03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93138 and previous config saved to /var/cache/conftool/dbconfig/20260527-031923-fceratto.json
* 03:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93137 and previous config saved to /var/cache/conftool/dbconfig/20260527-030915-fceratto.json
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93136 and previous config saved to /var/cache/conftool/dbconfig/20260527-030045-fceratto.json
* 03:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: Maintenance
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93135 and previous config saved to /var/cache/conftool/dbconfig/20260527-030016-fceratto.json
* 02:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93134 and previous config saved to /var/cache/conftool/dbconfig/20260527-025008-fceratto.json
* 02:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93133 and previous config saved to /var/cache/conftool/dbconfig/20260527-024000-fceratto.json
* 02:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93132 and previous config saved to /var/cache/conftool/dbconfig/20260527-022953-fceratto.json
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93131 and previous config saved to /var/cache/conftool/dbconfig/20260527-022133-fceratto.json
* 02:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93130 and previous config saved to /var/cache/conftool/dbconfig/20260527-022100-fceratto.json
* 02:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93129 and previous config saved to /var/cache/conftool/dbconfig/20260527-021053-fceratto.json
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93128 and previous config saved to /var/cache/conftool/dbconfig/20260527-020045-fceratto.json
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93127 and previous config saved to /var/cache/conftool/dbconfig/20260527-015037-fceratto.json
* 01:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93126 and previous config saved to /var/cache/conftool/dbconfig/20260527-014204-fceratto.json
* 01:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 01:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93125 and previous config saved to /var/cache/conftool/dbconfig/20260527-014134-fceratto.json
* 01:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93124 and previous config saved to /var/cache/conftool/dbconfig/20260527-013126-fceratto.json
* 01:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93123 and previous config saved to /var/cache/conftool/dbconfig/20260527-012119-fceratto.json
* 01:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93122 and previous config saved to /var/cache/conftool/dbconfig/20260527-011111-fceratto.json
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93121 and previous config saved to /var/cache/conftool/dbconfig/20260527-010234-fceratto.json
* 01:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93120 and previous config saved to /var/cache/conftool/dbconfig/20260527-010205-fceratto.json
* 00:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93119 and previous config saved to /var/cache/conftool/dbconfig/20260527-005157-fceratto.json
* 00:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93118 and previous config saved to /var/cache/conftool/dbconfig/20260527-004149-fceratto.json
* 00:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93117 and previous config saved to /var/cache/conftool/dbconfig/20260527-003141-fceratto.json
* 00:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93116 and previous config saved to /var/cache/conftool/dbconfig/20260527-002309-fceratto.json
* 00:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 00:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93115 and previous config saved to /var/cache/conftool/dbconfig/20260527-002228-fceratto.json
* 00:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93114 and previous config saved to /var/cache/conftool/dbconfig/20260527-001220-fceratto.json
* 00:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93113 and previous config saved to /var/cache/conftool/dbconfig/20260527-000209-fceratto.json
== 2026-05-26 ==
* 23:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93112 and previous config saved to /var/cache/conftool/dbconfig/20260526-235201-fceratto.json
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93111 and previous config saved to /var/cache/conftool/dbconfig/20260526-234451-fceratto.json
* 23:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93110 and previous config saved to /var/cache/conftool/dbconfig/20260526-234421-fceratto.json
* 23:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93109 and previous config saved to /var/cache/conftool/dbconfig/20260526-233414-fceratto.json
* 23:27 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 23:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93108 and previous config saved to /var/cache/conftool/dbconfig/20260526-232406-fceratto.json
* 23:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93107 and previous config saved to /var/cache/conftool/dbconfig/20260526-231358-fceratto.json
* 23:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93106 and previous config saved to /var/cache/conftool/dbconfig/20260526-230650-fceratto.json
* 23:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93105 and previous config saved to /var/cache/conftool/dbconfig/20260526-230620-fceratto.json
* 22:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93104 and previous config saved to /var/cache/conftool/dbconfig/20260526-225612-fceratto.json
* 22:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93103 and previous config saved to /var/cache/conftool/dbconfig/20260526-224604-fceratto.json
* 22:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93101 and previous config saved to /var/cache/conftool/dbconfig/20260526-223556-fceratto.json
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93100 and previous config saved to /var/cache/conftool/dbconfig/20260526-222848-fceratto.json
* 22:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93099 and previous config saved to /var/cache/conftool/dbconfig/20260526-222828-fceratto.json
* 22:23 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp6015.drmrs.wmnet
* 22:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93098 and previous config saved to /var/cache/conftool/dbconfig/20260526-221819-fceratto.json
* 22:10 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS trixie
* 22:08 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS trixie
* 22:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93097 and previous config saved to /var/cache/conftool/dbconfig/20260526-220811-fceratto.json
* 22:04 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] (duration: 09m 30s)
* 22:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 22:00 egardner@deploy1003: egardner, mfossati: Continuing with deployment
* 21:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93096 and previous config saved to /var/cache/conftool/dbconfig/20260526-215803-fceratto.json
* 21:57 egardner@deploy1003: egardner, mfossati: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:56 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:56 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1010.eqiad.wmnet with OS trixie
* 21:56 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:55 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]]
* 21:54 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 21:51 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93095 and previous config saved to /var/cache/conftool/dbconfig/20260526-215043-fceratto.json
* 21:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93094 and previous config saved to /var/cache/conftool/dbconfig/20260526-215011-fceratto.json
* 21:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:47 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1009
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1009
* 21:43 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1009
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:42 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1008
* 21:40 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1008
* 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93093 and previous config saved to /var/cache/conftool/dbconfig/20260526-214003-fceratto.json
* 21:36 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1008
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:36 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:35 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1010
* 21:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1010
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1010.eqiad.wmnet with OS trixie
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1009
* 21:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS trixie
* 21:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93092 and previous config saved to /var/cache/conftool/dbconfig/20260526-212955-fceratto.json
* 21:29 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1008
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS trixie
* 21:27 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "all.dblist - mediamoderation-continuous-scan.dblist - preinstall.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose` in tmux session - [[phab:T421688|T421688]]
* 21:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93091 and previous config saved to /var/cache/conftool/dbconfig/20260526-211948-fceratto.json
* 21:19 jhathaway: dmarc ingress test run mx-in1001
* 21:15 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw and A:cp
* 21:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2057.codfw.wmnet
* 21:14 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_codfw and A:cp
* 21:14 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2058.codfw.wmnet
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93090 and previous config saved to /var/cache/conftool/dbconfig/20260526-211238-fceratto.json
* 21:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2222.codfw.wmnet with reason: Maintenance
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93089 and previous config saved to /var/cache/conftool/dbconfig/20260526-211207-fceratto.json
* 21:06 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 21:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93088 and previous config saved to /var/cache/conftool/dbconfig/20260526-210159-fceratto.json
* 20:55 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2003.codfw.wmnet with reason: WIP
* 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93087 and previous config saved to /var/cache/conftool/dbconfig/20260526-205152-fceratto.json
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:50 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:45 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93086 and previous config saved to /var/cache/conftool/dbconfig/20260526-204143-fceratto.json
* 20:38 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2055.codfw.wmnet
* 20:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93085 and previous config saved to /var/cache/conftool/dbconfig/20260526-203430-fceratto.json
* 20:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2221.codfw.wmnet with reason: Maintenance
* 20:34 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2056.codfw.wmnet
* 20:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93084 and previous config saved to /var/cache/conftool/dbconfig/20260526-203357-fceratto.json
* 20:32 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 20:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 20:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93083 and previous config saved to /var/cache/conftool/dbconfig/20260526-202349-fceratto.json
* 20:18 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] (duration: 09m 14s)
* 20:14 alexsanford@deploy1003: alexsanford, aude: Continuing with deployment
* 20:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93082 and previous config saved to /var/cache/conftool/dbconfig/20260526-201341-fceratto.json
* 20:11 alexsanford@deploy1003: alexsanford, aude: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]]
* 20:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93081 and previous config saved to /var/cache/conftool/dbconfig/20260526-200333-fceratto.json
* 19:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2053.codfw.wmnet
* 19:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2029.codfw.wmnet with OS trixie
* 19:57 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2028.codfw.wmnet with OS trixie
* 19:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93080 and previous config saved to /var/cache/conftool/dbconfig/20260526-195632-fceratto.json
* 19:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
* 19:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93079 and previous config saved to /var/cache/conftool/dbconfig/20260526-195557-fceratto.json
* 19:55 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2054.codfw.wmnet
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93078 and previous config saved to /var/cache/conftool/dbconfig/20260526-194549-fceratto.json
* 19:45 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2014.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2013.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:38 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93077 and previous config saved to /var/cache/conftool/dbconfig/20260526-193541-fceratto.json
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:30 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93076 and previous config saved to /var/cache/conftool/dbconfig/20260526-192533-fceratto.json
* 19:24 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:21 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 19:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2051.codfw.wmnet
* 19:19 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93075 and previous config saved to /var/cache/conftool/dbconfig/20260526-191818-fceratto.json
* 19:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 19:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93074 and previous config saved to /var/cache/conftool/dbconfig/20260526-191748-fceratto.json
* 19:16 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2052.codfw.wmnet
* 19:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93073 and previous config saved to /var/cache/conftool/dbconfig/20260526-190740-fceratto.json
* 19:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 19:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93072 and previous config saved to /var/cache/conftool/dbconfig/20260526-185732-fceratto.json
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93071 and previous config saved to /var/cache/conftool/dbconfig/20260526-184724-fceratto.json
* 18:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
* 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
* 18:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb2014.codfw.wmnet with OS trixie
* 18:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2049.codfw.wmnet
* 18:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93070 and previous config saved to /var/cache/conftool/dbconfig/20260526-184009-fceratto.json
* 18:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93069 and previous config saved to /var/cache/conftool/dbconfig/20260526-183939-fceratto.json
* 18:37 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2050.codfw.wmnet
* 18:30 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93068 and previous config saved to /var/cache/conftool/dbconfig/20260526-182931-fceratto.json
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:29 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:24 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 18:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93066 and previous config saved to /var/cache/conftool/dbconfig/20260526-181923-fceratto.json
* 18:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93065 and previous config saved to /var/cache/conftool/dbconfig/20260526-180915-fceratto.json
* 18:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93064 and previous config saved to /var/cache/conftool/dbconfig/20260526-180205-fceratto.json
* 18:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 18:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93063 and previous config saved to /var/cache/conftool/dbconfig/20260526-180132-fceratto.json
* 18:00 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2047.codfw.wmnet
* 17:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2048.codfw.wmnet
* 17:54 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93062 and previous config saved to /var/cache/conftool/dbconfig/20260526-175124-fceratto.json
* 17:42 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] (duration: 07m 25s)
* 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93060 and previous config saved to /var/cache/conftool/dbconfig/20260526-174117-fceratto.json
* 17:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ms-be2089.codfw.wmnet
* 17:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]]
* 17:33 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93059 and previous config saved to /var/cache/conftool/dbconfig/20260526-173109-fceratto.json
* 17:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:26 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:25 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93058 and previous config saved to /var/cache/conftool/dbconfig/20260526-172332-fceratto.json
* 17:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93057 and previous config saved to /var/cache/conftool/dbconfig/20260526-172303-fceratto.json
* 17:21 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet
* 17:20 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 17:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet
* 17:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:17 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 17:14 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:13 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:13 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93056 and previous config saved to /var/cache/conftool/dbconfig/20260526-171255-fceratto.json
* 17:11 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93055 and previous config saved to /var/cache/conftool/dbconfig/20260526-170247-fceratto.json
* 17:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:57 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93054 and previous config saved to /var/cache/conftool/dbconfig/20260526-165240-fceratto.json
* 16:50 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93053 and previous config saved to /var/cache/conftool/dbconfig/20260526-164421-fceratto.json
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
* 16:44 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93052 and previous config saved to /var/cache/conftool/dbconfig/20260526-164352-fceratto.json
* 16:42 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2043.codfw.wmnet
* 16:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2044.codfw.wmnet
* 16:40 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:40 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:40 brett: reboot lvs 101[345].eqiad.wmnet
* 16:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:36 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:35 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_codfw and A:cp
* 16:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93051 and previous config saved to /var/cache/conftool/dbconfig/20260526-163344-fceratto.json
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_codfw and A:cp
* 16:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:31 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93050 and previous config saved to /var/cache/conftool/dbconfig/20260526-162336-fceratto.json
* 16:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93049 and previous config saved to /var/cache/conftool/dbconfig/20260526-161328-fceratto.json
* 16:11 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:11 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=eqiad
* 16:06 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93047 and previous config saved to /var/cache/conftool/dbconfig/20260526-160450-fceratto.json
* 16:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93046 and previous config saved to /var/cache/conftool/dbconfig/20260526-160420-fceratto.json
* 16:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:03 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 28s)
* 16:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 16:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 22s)
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93045 and previous config saved to /var/cache/conftool/dbconfig/20260526-155413-fceratto.json
* 15:46 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=eqiad
* 15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93044 and previous config saved to /var/cache/conftool/dbconfig/20260526-154405-fceratto.json
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93043 and previous config saved to /var/cache/conftool/dbconfig/20260526-153357-fceratto.json
* 15:30 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93042 and previous config saved to /var/cache/conftool/dbconfig/20260526-152629-fceratto.json
* 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93041 and previous config saved to /var/cache/conftool/dbconfig/20260526-152559-fceratto.json
* 15:24 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93040 and previous config saved to /var/cache/conftool/dbconfig/20260526-151552-fceratto.json
* 15:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Rack maintenance completed
* 15:10 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2196.codfw.wmnet
* 15:10 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2196.codfw.wmnet
* 15:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=codfw
* 15:06 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: Rack maintenance completed
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93037 and previous config saved to /var/cache/conftool/dbconfig/20260526-150546-fceratto.json
* 15:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: Rack maintenance completed
* 15:04 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]] (duration: 00m 39s)
* 15:03 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]]
* 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]] (duration: 00m 45s)
* 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]]
* 15:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 15:01 bjensen: uploading prometheus-memcached-exporter_0.16.0-1_amd64 on apt1002
* 15:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 15:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2223: switch maintenance
* 14:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Rack maintenance completed
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93033 and previous config saved to /var/cache/conftool/dbconfig/20260526-145538-fceratto.json
* 14:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 moritzm: remove ganeti1025 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2222: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2221: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:49 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93030 and previous config saved to /var/cache/conftool/dbconfig/20260526-144718-fceratto.json
* 14:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 14:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93029 and previous config saved to /var/cache/conftool/dbconfig/20260526-144651-fceratto.json
* 14:45 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:45 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:43 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=codfw
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2167: Migration of db2167.codfw.wmnet completed
* 14:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93026 and previous config saved to /var/cache/conftool/dbconfig/20260526-143643-fceratto.json
* 14:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1054.eqiad.wmnet with OS trixie
* 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93023 and previous config saved to /var/cache/conftool/dbconfig/20260526-142636-fceratto.json
* 14:26 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1014: Rack maintenance completed
* 14:24 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1014: Rack maintenance completed
* 14:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 14:19 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:19 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:18 jynus: restarting mediabackups@codfw after maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93021 and previous config saved to /var/cache/conftool/dbconfig/20260526-141628-fceratto.json
* 14:16 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:14 fabfur: repooled cp2043 ([[phab:T426199|T426199]])
* 14:14 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2223: switch maintenance
* 14:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:14 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 14:13 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] (duration: 06m 40s)
* 14:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:10 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:10 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2011.codfw.wmnet
* 14:10 fabfur@cumin1003: START - Cookbook sre.hosts.remove-downtime for lvs2011.codfw.wmnet
* 14:09 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:09 fabfur: restoring lvs2011 as primary ([[phab:T426199|T426199]])
* 14:08 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:08 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93017 and previous config saved to /var/cache/conftool/dbconfig/20260526-140748-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93016 and previous config saved to /var/cache/conftool/dbconfig/20260526-140718-fceratto.json
* 14:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]]
* 14:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
* 14:05 marostegui@cumin1003: Removing pc1013 from zarcillo [[phab:T427190|T427190]]
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1013.eqiad.wmnet
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:04 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:00 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 13:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93014 and previous config saved to /var/cache/conftool/dbconfig/20260526-135711-fceratto.json
* 13:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1054.eqiad.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2167: Migration of db2167.codfw.wmnet completed
* 13:53 Amir1: drop flaggedrevs tables on cawikinews ([[phab:T423577|T423577]])
* 13:49 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1013.eqiad.wmnet
* 13:49 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 13:48 Lucas_WMDE: UTC afternoon backport+config window done
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93012 and previous config saved to /var/cache/conftool/dbconfig/20260526-134703-fceratto.json
* 13:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2167.codfw.wmnet with OS trixie
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93011 and previous config saved to /var/cache/conftool/dbconfig/20260526-133656-fceratto.json
* 13:36 XioNoX: reboot lsw1-a2-codfw for software upgrade - [[phab:T426199|T426199]]
* 13:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: switch maintenance
* 13:35 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] (duration: 09m 28s)
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2221: switch maintenance
* 13:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: switch maintenance
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2196: switch maintenance
* 13:31 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:30 stran@deploy1003: stran: Continuing with deployment
* 13:29 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93006 and previous config saved to /var/cache/conftool/dbconfig/20260526-132927-fceratto.json
* 13:29 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: Maintenance
* 13:29 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 34 hosts with reason: Switch maintenance
* 13:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93005 and previous config saved to /var/cache/conftool/dbconfig/20260526-132857-fceratto.json
* 13:28 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lsw1-a2-codfw,lsw1-a2-codfw IPv6,lsw1-a2-codfw.mgmt with reason: Switch maintenance
* 13:27 stran@deploy1003: stran: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]]
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] (duration: 08m 30s)
* 13:22 ladsgroup@dns1004: END - running authdns-update
* 13:20 ladsgroup@dns1004: START - running authdns-update
* 13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93004 and previous config saved to /var/cache/conftool/dbconfig/20260526-131850-fceratto.json
* 13:18 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Continuing with deployment
* 13:16 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]]
* 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] (duration: 07m 09s)
* 13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93003 and previous config saved to /var/cache/conftool/dbconfig/20260526-130842-fceratto.json
* 13:08 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:07 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2167.codfw.wmnet with OS trixie
* 13:07 sbisson@deploy1003: sbisson: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2167: Upgrading db2167.codfw.wmnet
* 13:05 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]]
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2167: Upgrading db2167.codfw.wmnet
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:04 kart_: Update Recommendation API to 2026-05-26-074931-production
* 13:03 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:00 topranks: deactivate CR BGP to doh2002 to test backup path via doh2001
* 12:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93000 and previous config saved to /var/cache/conftool/dbconfig/20260526-125834-fceratto.json
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92999 and previous config saved to /var/cache/conftool/dbconfig/20260526-125135-fceratto.json
* 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2226.codfw.wmnet with reason: Maintenance
* 12:51 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92998 and previous config saved to /var/cache/conftool/dbconfig/20260526-125105-fceratto.json
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92997 and previous config saved to /var/cache/conftool/dbconfig/20260526-124059-fceratto.json
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2003.wikimedia.org
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1214: Migration of db1214.eqiad.wmnet completed
* 12:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc2003.wikimedia.org
* 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92995 and previous config saved to /var/cache/conftool/dbconfig/20260526-123052-fceratto.json
* 12:26 fabfur: depooled cp2043 for network activity ([[phab:T426199|T426199]])
* 12:26 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 12:24 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-a1-codfw,ssw1-a1-codfw IPv6,ssw1-a1-codfw.mgmt with reason: Switch maintenance
* 12:24 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mirror1001.wikimedia.org
* 12:23 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 12:23 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 12:22 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92993 and previous config saved to /var/cache/conftool/dbconfig/20260526-122044-fceratto.json
* 12:20 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 12:19 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mirror1001.wikimedia.org
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92991 and previous config saved to /var/cache/conftool/dbconfig/20260526-121336-fceratto.json
* 12:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: Maintenance
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92990 and previous config saved to /var/cache/conftool/dbconfig/20260526-121306-fceratto.json
* 12:09 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Planned downtime for rack maintenance
* 12:08 fabfur: downtime, disable puppet and stop pybal for rack maintenance ([[phab:T426199|T426199]])
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2181: Migration of db2181.codfw.wmnet completed
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92987 and previous config saved to /var/cache/conftool/dbconfig/20260526-120258-fceratto.json
* 12:01 XioNoX: start ssw1-a1-codfw network maintenance (no impact expected as the spines are redundant)
* 11:59 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] (duration: 15m 26s)
* 11:56 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup2015.codfw.wmnet,db2197.codfw.wmnet with reason: network maintenance
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1005.eqiad.wmnet
* 11:55 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
* 11:54 jynus: stopping mediabackups@codfw for maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 11:54 jmm@dns1004: END - running authdns-update
* 11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92985 and previous config saved to /var/cache/conftool/dbconfig/20260526-115251-fceratto.json
* 11:52 jmm@dns1004: START - running authdns-update
* 11:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1005.eqiad.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1214: Migration of db1214.eqiad.wmnet completed
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1004.eqiad.wmnet
* 11:47 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:46 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1004.eqiad.wmnet
* 11:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]]
* 11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92983 and previous config saved to /var/cache/conftool/dbconfig/20260526-114243-fceratto.json
* 11:42 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1214.eqiad.wmnet with OS trixie
* 11:35 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] (duration: 06m 46s)
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92981 and previous config saved to /var/cache/conftool/dbconfig/20260526-113542-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92980 and previous config saved to /var/cache/conftool/dbconfig/20260526-113521-fceratto.json
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1222: Migration of db1222.eqiad.wmnet completed
* 11:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]]
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92978 and previous config saved to /var/cache/conftool/dbconfig/20260526-112513-fceratto.json
* 11:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92977 and previous config saved to /var/cache/conftool/dbconfig/20260526-112326-marostegui.json
* 11:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2181: Migration of db2181.codfw.wmnet completed
* 11:22 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92975 and previous config saved to /var/cache/conftool/dbconfig/20260526-112215-marostegui.json
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Switchover es2042 es2041 for [[phab:T426199|T426199]]', diff saved to https://phabricator.wikimedia.org/P92974 and previous config saved to /var/cache/conftool/dbconfig/20260526-112028-fceratto.json
* 11:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92972 and previous config saved to /var/cache/conftool/dbconfig/20260526-111506-fceratto.json
* 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2181.codfw.wmnet with OS trixie
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92971 and previous config saved to /var/cache/conftool/dbconfig/20260526-110458-fceratto.json
* 11:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1214.eqiad.wmnet with OS trixie
* 11:00 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] (duration: 15m 50s)
* 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92968 and previous config saved to /var/cache/conftool/dbconfig/20260526-105755-fceratto.json
* 10:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92967 and previous config saved to /var/cache/conftool/dbconfig/20260526-105726-fceratto.json
* 10:56 jiji@deploy1003: jiji: Continuing with deployment
* 10:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:51 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92966 and previous config saved to /var/cache/conftool/dbconfig/20260526-104718-fceratto.json
* 10:46 jiji@deploy1003: jiji: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:44 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]]
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92964 and previous config saved to /var/cache/conftool/dbconfig/20260526-103711-fceratto.json
* 10:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2181.codfw.wmnet with OS trixie
* 10:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:32 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92963 and previous config saved to /var/cache/conftool/dbconfig/20260526-102703-fceratto.json
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1226: Migration of db1226.eqiad.wmnet completed
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92960 and previous config saved to /var/cache/conftool/dbconfig/20260526-101936-fceratto.json
* 10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92959 and previous config saved to /var/cache/conftool/dbconfig/20260526-101842-fceratto.json
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:15 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] (duration: 06m 42s)
* 10:09 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92957 and previous config saved to /var/cache/conftool/dbconfig/20260526-100834-fceratto.json
* 10:06 kharlan@deploy1003: kharlan: Continuing with deployment
* 10:05 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:03 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]]
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2195: Migration of db2195.codfw.wmnet completed
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2004.codfw.wmnet
* 10:01 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2004.codfw.wmnet
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
* 09:58 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92955 and previous config saved to /var/cache/conftool/dbconfig/20260526-095827-fceratto.json
* 09:58 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-eqiad@eqiad
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:55 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2003.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2003.codfw.wmnet
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1006.eqiad.wmnet
* 09:53 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1006.eqiad.wmnet
* 09:52 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-eqiad@eqiad
* 09:52 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] (duration: 08m 07s)
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2044.*
* 09:48 fabfur: repooling cp2043 and cp2044 (haproxy-awslc) ([[phab:T419825|T419825]])
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92953 and previous config saved to /var/cache/conftool/dbconfig/20260526-094819-fceratto.json
* 09:47 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:46 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1006.eqiad.wmnet
* 09:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:44 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]]
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1006.eqiad.wmnet
* 09:41 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1005.eqiad.wmnet
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1005.eqiad.wmnet
* 09:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92951 and previous config saved to /var/cache/conftool/dbconfig/20260526-094115-fceratto.json
* 09:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 09:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92950 and previous config saved to /var/cache/conftool/dbconfig/20260526-094045-fceratto.json
* 09:40 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1226: Migration of db1226.eqiad.wmnet completed
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:38 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:34 fabfur: depooling cp2044 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1005.eqiad.wmnet
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2003.codfw.wmnet
* 09:34 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2044.*
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1005.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1004.eqiad.wmnet
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1004.eqiad.wmnet
* 09:33 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 09:32 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] (duration: 06m 52s)
* 09:32 fabfur: depooling cp2043 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1226.eqiad.wmnet with OS trixie
* 09:30 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 09:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92947 and previous config saved to /var/cache/conftool/dbconfig/20260526-093031-fceratto.json
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2003.codfw.wmnet
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2002.codfw.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2002.codfw.wmnet
* 09:28 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:28 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1003.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1003.eqiad.wmnet
* 09:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]]
* 09:25 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:25 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2001.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2001.codfw.wmnet
* 09:21 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:20 fabfur: start rebooting esams liberica instances ([[phab:T426563|T426563]])
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92946 and previous config saved to /var/cache/conftool/dbconfig/20260526-092024-fceratto.json
* 09:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1003.eqiad.wmnet
* 09:16 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2195: Migration of db2195.codfw.wmnet completed
* 09:15 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1003.eqiad.wmnet
* 09:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 09:14 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] (duration: 06m 47s)
* 09:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92944 and previous config saved to /var/cache/conftool/dbconfig/20260526-091016-fceratto.json
* 09:09 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 09:09 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2195.codfw.wmnet with OS trixie
* 09:07 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]]
* 09:06 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92943 and previous config saved to /var/cache/conftool/dbconfig/20260526-090315-fceratto.json
* 09:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2224.codfw.wmnet with reason: Maintenance
* 09:03 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92942 and previous config saved to /var/cache/conftool/dbconfig/20260526-090256-fceratto.json
* 08:57 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1226.eqiad.wmnet with OS trixie
* 08:53 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 fabfur: start rebooting ulsfo liberica instances ([[phab:T426563|T426563]])
* 08:53 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] (duration: 07m 23s)
* 08:53 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1226: Upgrading db1226.eqiad.wmnet
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92941 and previous config saved to /var/cache/conftool/dbconfig/20260526-085248-fceratto.json
* 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1226: Upgrading db1226.eqiad.wmnet
* 08:50 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Migration of db1222.eqiad.wmnet completed
* 08:48 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 08:47 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:46 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]]
* 08:43 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 08:43 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] (duration: 09m 56s)
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92939 and previous config saved to /var/cache/conftool/dbconfig/20260526-084240-fceratto.json
* 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1222.eqiad.wmnet with OS trixie
* 08:40 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:40 fabfur: start rebooting eqsin liberica instances ([[phab:T426563|T426563]])
* 08:39 kartik@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 08:39 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:39 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1024.eqiad.wmnet
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:35 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:33 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]]
* 08:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92938 and previous config saved to /var/cache/conftool/dbconfig/20260526-083233-fceratto.json
* 08:30 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92937 and previous config saved to /var/cache/conftool/dbconfig/20260526-082531-fceratto.json
* 08:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
* 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92936 and previous config saved to /var/cache/conftool/dbconfig/20260526-082458-fceratto.json
* 08:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2195.codfw.wmnet with OS trixie
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2195: Upgrading db2195.codfw.wmnet
* 08:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2195: Upgrading db2195.codfw.wmnet
* 08:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:18 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92934 and previous config saved to /var/cache/conftool/dbconfig/20260526-081451-fceratto.json
* 08:13 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:12 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:10 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92932 and previous config saved to /var/cache/conftool/dbconfig/20260526-080443-fceratto.json
* 08:01 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1222.eqiad.wmnet with OS trixie
* 08:00 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1024.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1023.eqiad.wmnet
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:59 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:58 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:56 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 07:56 fabfur: start rebooting drmrs liberica instances ([[phab:T426563|T426563]])
* 07:56 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92931 and previous config saved to /var/cache/conftool/dbconfig/20260526-075435-fceratto.json
* 07:52 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1047.eqiad.wmnet
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:49 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1023.eqiad.wmnet
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92930 and previous config saved to /var/cache/conftool/dbconfig/20260526-074739-fceratto.json
* 07:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92929 and previous config saved to /var/cache/conftool/dbconfig/20260526-074710-fceratto.json
* 07:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:45 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:43 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1046.eqiad.wmnet
* 07:38 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] (duration: 12m 01s)
* 07:38 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P92928 and previous config saved to /var/cache/conftool/dbconfig/20260526-073702-fceratto.json
* 07:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:35 fabfur: start rebooting magru liberica instances ([[phab:T426563|T426563]])
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92926 and previous config saved to /var/cache/conftool/dbconfig/20260526-073459-fceratto.json
* 07:32 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
* 07:31 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1046.eqiad.wmnet
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260526-072643-fceratto.json
* 07:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
* 07:26 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]]
* 07:25 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92924 and previous config saved to /var/cache/conftool/dbconfig/20260526-072452-fceratto.json
* 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1047.eqiad.wmnet
* 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1047.eqiad.wmnet
* 07:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92923 and previous config saved to /var/cache/conftool/dbconfig/20260526-071635-fceratto.json
* 07:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
* 07:15 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1026.eqiad.wmnet
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92922 and previous config saved to /var/cache/conftool/dbconfig/20260526-071444-fceratto.json
* 07:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 07:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92921 and previous config saved to /var/cache/conftool/dbconfig/20260526-070946-fceratto.json
* 07:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92920 and previous config saved to /var/cache/conftool/dbconfig/20260526-070916-fceratto.json
* 07:09 moritzm: failover Ganeti master in eqiad to ganeti1048
* 07:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1047.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1046.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92919 and previous config saved to /var/cache/conftool/dbconfig/20260526-070436-fceratto.json
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
* 07:04 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1048.eqiad.wmnet
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92918 and previous config saved to /var/cache/conftool/dbconfig/20260526-065909-fceratto.json
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast2003.wikimedia.org
* 06:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 06:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1048.eqiad.wmnet
* 06:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
* 06:53 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1046.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1045.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 06:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92917 and previous config saved to /var/cache/conftool/dbconfig/20260526-064901-fceratto.json
* 06:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92916 and previous config saved to /var/cache/conftool/dbconfig/20260526-064833-fceratto.json
* 06:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
* 06:47 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1222: Switchover
* 06:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6003.wikimedia.org
* 06:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92914 and previous config saved to /var/cache/conftool/dbconfig/20260526-063853-fceratto.json
* 06:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6003.wikimedia.org
* 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92912 and previous config saved to /var/cache/conftool/dbconfig/20260526-063155-fceratto.json
* 06:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:28 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Switchover
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1222 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92910 and previous config saved to /var/cache/conftool/dbconfig/20260526-061656-fceratto.json
* 06:15 fceratto@dns1005: END - running authdns-update
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1162 to s2 primary and set section read-write [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92909 and previous config saved to /var/cache/conftool/dbconfig/20260526-061114-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s2 eqiad as read-only for maintenance - [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92908 and previous config saved to /var/cache/conftool/dbconfig/20260526-061021-fceratto.json
* 06:10 federico3: Starting s2 eqiad failover from db1222 to db1162 - [[phab:T425622|T425622]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1162 with weight 0 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92907 and previous config saved to /var/cache/conftool/dbconfig/20260526-060443-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T425622|T425622]]
* 06:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:02 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:00 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2024.codfw.wmnet,pc[1014,1024].eqiad.wmnet with reason: Maintenance on pc4
* 04:37 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 04:34 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.1 (duration: 02m 32s)
* 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]] (duration: 36m 24s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 20s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-25 ==
* 21:00 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:49 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 20:38 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1045.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1044.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:15 moritzm: truncate krb5kdc.log1 (which made log rotation fail)
* 20:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 19:57 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1044.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1043.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:22 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_eqiad
* 18:49 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1115.eqiad.wmnet
* 18:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5023.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:22 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:22 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5030.eqsin.wmnet
* 18:15 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5023.eqsin.wmnet
* 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1113.eqiad.wmnet
* 18:03 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:02 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqiad
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-upload_eqsin
* 18:01 sukhe: sre.cdn.roll-reboot cookbooks stalled due to icinga reboot
* 18:00 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqsin
* 17:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1043.eqiad.wmnet
* 17:31 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1110.eqiad.wmnet [reason: manually pooling after reboot as icinga was down]
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1042.eqiad.wmnet
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:29 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1111.eqiad.wmnet
* 17:28 sukhe: sukhe@alert1002:~$ sudo systemctl restart icinga.service
* 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92903 and previous config saved to /var/cache/conftool/dbconfig/20260525-171310-fceratto.json
* 17:11 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92902 and previous config saved to /var/cache/conftool/dbconfig/20260525-170302-fceratto.json
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92901 and previous config saved to /var/cache/conftool/dbconfig/20260525-165255-fceratto.json
* 16:51 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1042.eqiad.wmnet
* 16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92900 and previous config saved to /var/cache/conftool/dbconfig/20260525-164247-fceratto.json
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1041.eqiad.wmnet
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:41 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:40 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5021.eqsin.wmnet
* 16:39 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5029.eqsin.wmnet
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92899 and previous config saved to /var/cache/conftool/dbconfig/20260525-163559-fceratto.json
* 16:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92898 and previous config saved to /var/cache/conftool/dbconfig/20260525-163512-fceratto.json
* 16:34 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1108.eqiad.wmnet
* 16:30 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1109.eqiad.wmnet
* 16:26 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92897 and previous config saved to /var/cache/conftool/dbconfig/20260525-162505-fceratto.json
* 16:20 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1041.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1040.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:16 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92896 and previous config saved to /var/cache/conftool/dbconfig/20260525-161457-fceratto.json
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92895 and previous config saved to /var/cache/conftool/dbconfig/20260525-160450-fceratto.json
* 16:02 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92894 and previous config saved to /var/cache/conftool/dbconfig/20260525-155930-fceratto.json
* 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2249.codfw.wmnet with reason: Maintenance
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5020.eqsin.wmnet
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5028.eqsin.wmnet
* 15:52 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1106.eqiad.wmnet
* 15:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1107.eqiad.wmnet
* 15:29 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1040.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1039.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:17 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1013 from dbctl [[phab:T427190|T427190]]', diff saved to https://phabricator.wikimedia.org/P92893 and previous config saved to /var/cache/conftool/dbconfig/20260525-151718-marostegui.json
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5019.eqsin.wmnet
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5027.eqsin.wmnet
* 15:12 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1104.eqiad.wmnet
* 15:11 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1105.eqiad.wmnet
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92892 and previous config saved to /var/cache/conftool/dbconfig/20260525-150309-fceratto.json
* 14:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92891 and previous config saved to /var/cache/conftool/dbconfig/20260525-145301-fceratto.json
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92890 and previous config saved to /var/cache/conftool/dbconfig/20260525-144253-fceratto.json
* 14:33 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1102.eqiad.wmnet
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92889 and previous config saved to /var/cache/conftool/dbconfig/20260525-143246-fceratto.json
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5026.eqsin.wmnet
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5018.eqsin.wmnet
* 14:31 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1103.eqiad.wmnet
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92888 and previous config saved to /var/cache/conftool/dbconfig/20260525-142551-fceratto.json
* 14:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2228.codfw.wmnet with reason: Maintenance
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92887 and previous config saved to /var/cache/conftool/dbconfig/20260525-142520-fceratto.json
* 14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92885 and previous config saved to /var/cache/conftool/dbconfig/20260525-141513-fceratto.json
* 14:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:06 sukhe: curl localhost:9090/pools/inference-staging-grpc_30051 shows ml-staging200[1-3].codfw.wmnet as enabled and pooled: [[phab:T424049|T424049]]
* 14:05 sukhe: sukhe@lvs2013:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92884 and previous config saved to /var/cache/conftool/dbconfig/20260525-140505-fceratto.json
* 14:03 sukhe: sudo cumin 'A:lvs and A:lvs-low-traffic-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service
* 14:00 sukhe: sudo cumin 'A:lvs and A:lvs-secondary-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 13:59 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1039.eqiad.wmnet
* 13:58 sukhe: sudo cumin 'A:lvs and A:eqiad' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"': NOOP change, since service is codfw only
* 13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92882 and previous config saved to /var/cache/conftool/dbconfig/20260525-135458-fceratto.json
* 13:52 Msz2001: Everything deployed, UTC afternoon config+backport window done
* 13:52 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] (duration: 09m 43s)
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1101.eqiad.wmnet
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1100.eqiad.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5025.eqsin.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:49 kart_: Updated Recommendation API to 2026-05-21-044522-production
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92881 and previous config saved to /var/cache/conftool/dbconfig/20260525-134807-fceratto.json
* 13:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2223.codfw.wmnet with reason: Maintenance
* 13:47 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:47 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main'.
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92880 and previous config saved to /var/cache/conftool/dbconfig/20260525-134737-fceratto.json
* 13:45 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: Reboot
* 13:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]]
* 13:40 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqiad
* 13:39 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqiad
* 13:38 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] (duration: 08m 14s)
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqsin
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqsin
* 13:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92878 and previous config saved to /var/cache/conftool/dbconfig/20260525-133729-fceratto.json
* 13:34 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:33 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main'.
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1038.eqiad.wmnet
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:31 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]]
* 13:27 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] (duration: 07m 43s)
* 13:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92876 and previous config saved to /var/cache/conftool/dbconfig/20260525-132722-fceratto.json
* 13:23 mszwarc@deploy1003: mszwarc, jhsoby: Continuing with deployment
* 13:21 mszwarc@deploy1003: mszwarc, jhsoby: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:20 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]]
* 13:19 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] (duration: 15m 53s)
* 13:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92875 and previous config saved to /var/cache/conftool/dbconfig/20260525-131714-fceratto.json
* 13:12 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92873 and previous config saved to /var/cache/conftool/dbconfig/20260525-131023-fceratto.json
* 13:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92872 and previous config saved to /var/cache/conftool/dbconfig/20260525-130950-fceratto.json
* 13:07 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]]
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1162: Reboot
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92870 and previous config saved to /var/cache/conftool/dbconfig/20260525-125942-fceratto.json
* 12:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reboot
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:58 kart_: Updated cxserver to 2026-05-24-103047-production ([[phab:T426808|T426808]], [[phab:T373418|T373418]])
* 12:56 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:56 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:54 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db1162: Reboot
* 12:54 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:54 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:53 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1162.eqiad.wmnet with reason: Reboot
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92868 and previous config saved to /var/cache/conftool/dbconfig/20260525-124934-fceratto.json
* 12:40 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:39 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:39 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1038.eqiad.wmnet
* 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92867 and previous config saved to /var/cache/conftool/dbconfig/20260525-123927-fceratto.json
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92866 and previous config saved to /var/cache/conftool/dbconfig/20260525-123239-fceratto.json
* 12:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92865 and previous config saved to /var/cache/conftool/dbconfig/20260525-123208-fceratto.json
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92864 and previous config saved to /var/cache/conftool/dbconfig/20260525-122201-fceratto.json
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1037.eqiad.wmnet
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92863 and previous config saved to /var/cache/conftool/dbconfig/20260525-121153-fceratto.json
* 12:10 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92862 and previous config saved to /var/cache/conftool/dbconfig/20260525-120145-fceratto.json
* 11:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92861 and previous config saved to /var/cache/conftool/dbconfig/20260525-115504-fceratto.json
* 11:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92860 and previous config saved to /var/cache/conftool/dbconfig/20260525-115434-fceratto.json
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92859 and previous config saved to /var/cache/conftool/dbconfig/20260525-114426-fceratto.json
* 11:43 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1037.eqiad.wmnet
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92858 and previous config saved to /var/cache/conftool/dbconfig/20260525-113419-fceratto.json
* 11:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2160.codfw.wmnet with OS trixie
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92857 and previous config saved to /var/cache/conftool/dbconfig/20260525-112411-fceratto.json
* 11:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92856 and previous config saved to /var/cache/conftool/dbconfig/20260525-111717-fceratto.json
* 11:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92855 and previous config saved to /var/cache/conftool/dbconfig/20260525-111648-fceratto.json
* 11:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92854 and previous config saved to /var/cache/conftool/dbconfig/20260525-110640-fceratto.json
* 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 10:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:56 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92853 and previous config saved to /var/cache/conftool/dbconfig/20260525-105633-fceratto.json
* 10:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92852 and previous config saved to /var/cache/conftool/dbconfig/20260525-104625-fceratto.json
* 10:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2160.codfw.wmnet with OS trixie
* 10:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92851 and previous config saved to /var/cache/conftool/dbconfig/20260525-104141-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to pc3 as master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92850 and previous config saved to /var/cache/conftool/dbconfig/20260525-104055-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to dbctl', diff saved to https://phabricator.wikimedia.org/P92849 and previous config saved to /var/cache/conftool/dbconfig/20260525-104027-marostegui.json
* 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92848 and previous config saved to /var/cache/conftool/dbconfig/20260525-103944-fceratto.json
* 10:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 10:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
* 10:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
* 10:27 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:16 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1006.eqiad.wmnet
* 09:57 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.eqiad.wmnet
* 09:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92847 and previous config saved to /var/cache/conftool/dbconfig/20260525-091302-fceratto.json
* 09:12 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92846 and previous config saved to /var/cache/conftool/dbconfig/20260525-090255-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92845 and previous config saved to /var/cache/conftool/dbconfig/20260525-085247-fceratto.json
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92844 and previous config saved to /var/cache/conftool/dbconfig/20260525-084239-fceratto.json
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92843 and previous config saved to /var/cache/conftool/dbconfig/20260525-083540-fceratto.json
* 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2231.codfw.wmnet with reason: Maintenance
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92842 and previous config saved to /var/cache/conftool/dbconfig/20260525-083511-fceratto.json
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92841 and previous config saved to /var/cache/conftool/dbconfig/20260525-082504-fceratto.json
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92840 and previous config saved to /var/cache/conftool/dbconfig/20260525-081456-fceratto.json
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92839 and previous config saved to /var/cache/conftool/dbconfig/20260525-080448-fceratto.json
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92838 and previous config saved to /var/cache/conftool/dbconfig/20260525-075739-fceratto.json
* 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: Maintenance
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92837 and previous config saved to /var/cache/conftool/dbconfig/20260525-075708-fceratto.json
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92836 and previous config saved to /var/cache/conftool/dbconfig/20260525-074700-fceratto.json
* 07:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92835 and previous config saved to /var/cache/conftool/dbconfig/20260525-073653-fceratto.json
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92834 and previous config saved to /var/cache/conftool/dbconfig/20260525-072645-fceratto.json
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92833 and previous config saved to /var/cache/conftool/dbconfig/20260525-071953-fceratto.json
* 07:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2196.codfw.wmnet with reason: Maintenance
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92832 and previous config saved to /var/cache/conftool/dbconfig/20260525-071924-fceratto.json
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92831 and previous config saved to /var/cache/conftool/dbconfig/20260525-070917-fceratto.json
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2233.codfw.wmnet with OS trixie
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92830 and previous config saved to /var/cache/conftool/dbconfig/20260525-065909-fceratto.json
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92829 and previous config saved to /var/cache/conftool/dbconfig/20260525-064902-fceratto.json
* 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92828 and previous config saved to /var/cache/conftool/dbconfig/20260525-064305-fceratto.json
* 06:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2233.codfw.wmnet with OS trixie
* 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2233.codfw.wmnet with reason: Reimage to Trixie
* 06:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:17 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboot upgrade m2
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2233.codfw.wmnet with reason: Reboot upgrade m2
* 06:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1027.eqiad.wmnet with reason: Reboot
* 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2023.codfw.wmnet,pc[1013,1023].eqiad.wmnet with reason: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1013.eqiad.wmnet: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1013.eqiad.wmnet: Maintenance on pc3
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 43s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-24 ==
* 19:08 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-23 ==
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-22 ==
* 23:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 23:38 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 22:20 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:29 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:28 inflatador: bking@deploy1003 set eqiad prod cirrus `node_concurrent_recoveries` up to 7 from 4 [[phab:T426585|T426585]]
* 20:27 inflatador: bking@deploy1003 set codfw prod cirrus `node_concurrent_recoveries` back down to 4 from 7 [[phab:T426585|T426585]]
* 18:39 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 17:34 topranks: enable ttl protection on esams CRs IBGP session
* 17:28 topranks: enable ttl protection on ulsfo CRs IBGP session
* 16:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:49 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:16 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:58 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:15 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:14 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2008-dev.codfw.wmnet
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb[1020,1022-1025].eqiad.wmnet
* 14:29 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:23 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2008-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2007-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:03 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 13:59 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb[1020,1022-1025].eqiad.wmnet
* 13:58 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 13:53 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2007-dev.codfw.wmnet
* 13:52 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1018.eqiad.wmnet
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:46 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for 6 hosts
* 13:16 inflatador: bking@deploy1002 set search_codfw cluster recovery settings from 4 to 7 [[phab:T426560|T426560]]
* 13:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for 6 hosts
* 13:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 13:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 13:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:10 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet
* 13:09 elukey: uploaded spicerack_12.6.0 to apt.wikimedia.org bookworm-wikimedia
* 13:08 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1017.eqiad.wmnet
* 12:59 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3081.esams.wmnet
* 12:54 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:41 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:15 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3080.esams.wmnet
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 12:03 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3073.esams.wmnet
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: Migration of db2154.codfw.wmnet completed
* 11:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3072.esams.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 11:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1172: Migration of db1172.eqiad.wmnet completed
* 11:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
* 11:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3079.esams.wmnet
* 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
* 10:55 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:48 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet
* 10:43 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2154: Migration of db2154.codfw.wmnet completed
* 10:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
* 10:37 moritzm: remove ganeti1024 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2154.codfw.wmnet with OS trixie
* 10:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2010.codfw.wmnet with OS trixie
* 10:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1172: Migration of db1172.eqiad.wmnet completed
* 10:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3078.esams.wmnet
* 10:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS trixie
* 10:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1017.eqiad.wmnet
* 10:13 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3071.esams.wmnet
* 09:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:56 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS trixie
* 09:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:51 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: Upgrading db2154.codfw.wmnet
* 09:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2154: Upgrading db2154.codfw.wmnet
* 09:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS trixie
* 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS trixie
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3070.esams.wmnet
* 09:21 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:16 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 09:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3077.esams.wmnet
* 09:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:03 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 08:47 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 08:40 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:30 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3076.esams.wmnet
* 08:18 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:15 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:07 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:07 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3069.esams.wmnet
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 07:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 07:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3068.esams.wmnet
* 07:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:10 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3075.esams.wmnet
* 07:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1057
* 07:01 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1057
* 06:58 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3067.esams.wmnet
* 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
* 06:46 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 06:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3007.wikimedia.org
* 06:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3007.wikimedia.org
* 05:25 marostegui@dns1004: END - running authdns-update
* 05:24 marostegui@dns1004: START - running authdns-update
* 05:23 marostegui: Failover m5-master [[phab:T426633|T426633]]
* 05:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1028.eqiad.wmnet with reason: Reboot
* 05:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy2005.codfw.wmnet with reason: Reboot
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1012.eqiad.wmnet
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:06 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:03 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 04:56 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1012.eqiad.wmnet
== 2026-05-21 ==
* 23:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] (duration: 06m 42s)
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:36 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]]
* 22:26 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2002.codfw.wmnet with OS trixie
* 22:08 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:03 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:02 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:44 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2002.codfw.wmnet with OS trixie
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:20 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:19 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:26 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:16 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 19:22 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase
* 19:10 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:59 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:53 papaul: rebooting msw1-codfw
* 18:50 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:39 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:50 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:48 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:43 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:39 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 17:39 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:38 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 17:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 17:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:36 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:25 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:25 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:23 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:22 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1016.eqiad.wmnet
* 17:22 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:13 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1016.eqiad.wmnet
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92810 and previous config saved to /var/cache/conftool/dbconfig/20260521-170823-ladsgroup.json
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 16:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 16:55 papaul: rebooting msw-d3-codfw
* 16:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:52 papaul: rebooting msw-c7-codfw
* 16:51 papaul: rebooting msw-c6-codfw
* 16:48 papaul: rebooting msw-b7-codfw
* 16:48 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet
* 16:45 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1014.eqiad.wmnet
* 16:43 papaul: rebooting msw-b6-codfw
* 16:40 papaul: rebooting msw-a1-codfw
* 16:37 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:37 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1014.eqiad.wmnet
* 16:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:33 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:26 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc1022.eqiad.wmnet with reason: Move to nftables
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc2022.codfw.wmnet with reason: Move to nftables
* 16:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Repooling
* 16:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92807 and previous config saved to /var/cache/conftool/dbconfig/20260521-161808-ladsgroup.json
* 16:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:52 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Repooling
* 15:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92804 and previous config saved to /var/cache/conftool/dbconfig/20260521-154108-fceratto.json
* 15:39 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92803 and previous config saved to /var/cache/conftool/dbconfig/20260521-153400-fceratto.json
* 15:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2048.codfw.wmnet with reason: Maintenance
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92802 and previous config saved to /var/cache/conftool/dbconfig/20260521-153331-fceratto.json
* 15:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:25 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92801 and previous config saved to /var/cache/conftool/dbconfig/20260521-152323-fceratto.json
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet
* 15:19 claime: Enabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 15:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet
* 15:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92800 and previous config saved to /var/cache/conftool/dbconfig/20260521-151316-fceratto.json
* 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 15:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet
* 15:07 elukey@cumin1003: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:05 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:04 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] (duration: 10m 11s)
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92799 and previous config saved to /var/cache/conftool/dbconfig/20260521-150308-fceratto.json
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet
* 15:00 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master
* 15:00 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:00 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.pki.restart-reboot (exit_code=0) rolling reboot on A:pki
* 14:57 claime: Disabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 14:56 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:55 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 14:54 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-build1001.eqiad.wmnet
* 14:54 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]]
* 14:54 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 14:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1001.eqiad.wmnet
* 14:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1001.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92798 and previous config saved to /var/cache/conftool/dbconfig/20260521-145132-fceratto.json
* 14:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2040.codfw.wmnet with reason: Maintenance
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92797 and previous config saved to /var/cache/conftool/dbconfig/20260521-145103-fceratto.json
* 14:50 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-build1001.eqiad.wmnet
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Migration of db2241.codfw.wmnet completed
* 14:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1001.eqiad.wmnet
* 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
* 14:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1001.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-eqiad
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1011.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1011.eqiad.wmnet
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92795 and previous config saved to /var/cache/conftool/dbconfig/20260521-144055-fceratto.json
* 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:37 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1011.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1011.eqiad.wmnet
* 14:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 14:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1010.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1010.eqiad.wmnet
* 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92793 and previous config saved to /var/cache/conftool/dbconfig/20260521-143045-fceratto.json
* 14:30 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:30 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:29 elukey@cumin1003: START - Cookbook sre.pki.restart-reboot rolling reboot on A:pki
* 14:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
* 14:27 slyngshede@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-reboot (exit_code=1) rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 14:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet
* 14:24 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1010.eqiad.wmnet
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92792 and previous config saved to /var/cache/conftool/dbconfig/20260521-142037-fceratto.json
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:17 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1010.eqiad.wmnet
* 14:14 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1009.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1009.eqiad.wmnet
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: repool after maintenance
* 14:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92789 and previous config saved to /var/cache/conftool/dbconfig/20260521-140906-fceratto.json
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
* 14:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92788 and previous config saved to /var/cache/conftool/dbconfig/20260521-140837-fceratto.json
* 14:08 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1009.eqiad.wmnet
* 14:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet
* 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
* 14:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Migration of db2241.codfw.wmnet completed
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1009.eqiad.wmnet
* 14:03 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1008.eqiad.wmnet
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1008.eqiad.wmnet
* 14:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2241.codfw.wmnet with OS trixie
* 13:59 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
* 13:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92786 and previous config saved to /var/cache/conftool/dbconfig/20260521-135830-fceratto.json
* 13:58 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1007.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1007.eqiad.wmnet
* 13:51 Lucas_WMDE: UTC afternoon backport+config window done
* 13:51 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] (duration: 07m 20s)
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92784 and previous config saved to /var/cache/conftool/dbconfig/20260521-134822-fceratto.json
* 13:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1007.eqiad.wmnet
* 13:47 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 13:46 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
* 13:45 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes
* 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:44 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 13:43 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]]
* 13:43 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 13:43 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1007.eqiad.wmnet
* 13:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1006.eqiad.wmnet
* 13:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1006.eqiad.wmnet
* 13:41 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] (duration: 06m 52s)
* 13:41 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 13:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet
* 13:38 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
* 13:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92782 and previous config saved to /var/cache/conftool/dbconfig/20260521-133815-fceratto.json
* 13:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1006.eqiad.wmnet
* 13:37 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
* 13:37 dbrant@deploy1003: dbrant: Continuing with deployment
* 13:36 dbrant@deploy1003: dbrant: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
* 13:35 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]]
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1006.eqiad.wmnet
* 13:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1005.eqiad.wmnet
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1005.eqiad.wmnet
* 13:31 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] (duration: 09m 11s)
* 13:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92781 and previous config saved to /var/cache/conftool/dbconfig/20260521-133116-fceratto.json
* 13:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92780 and previous config saved to /var/cache/conftool/dbconfig/20260521-133048-fceratto.json
* 13:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
* 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet
* 13:27 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1005.eqiad.wmnet
* 13:27 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: repool after maintenance
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2241.codfw.wmnet with OS trixie
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:24 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:22 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]]
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1005.eqiad.wmnet
* 13:22 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1004.eqiad.wmnet
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1004.eqiad.wmnet
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92778 and previous config saved to /var/cache/conftool/dbconfig/20260521-132041-fceratto.json
* 13:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
* 13:20 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] (duration: 11m 55s)
* 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 13:17 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet
* 13:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1039: Repooling
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
* 13:15 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1004.eqiad.wmnet
* 13:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 13:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1004.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92776 and previous config saved to /var/cache/conftool/dbconfig/20260521-131033-fceratto.json
* 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1003.eqiad.wmnet
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1003.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 13:10 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2241 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92775 and previous config saved to /var/cache/conftool/dbconfig/20260521-131025-cwilliams.json
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:10 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 13:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3074.esams.wmnet
* 13:06 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2162 to x3 primary [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92774 and previous config saved to /var/cache/conftool/dbconfig/20260521-130609-cwilliams.json
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:04 cezmunsta: Starting x3 codfw failover from db2241 to db2162 - [[phab:T426936|T426936]]
* 13:04 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1003.eqiad.wmnet
* 13:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 13:00 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92772 and previous config saved to /var/cache/conftool/dbconfig/20260521-130018-fceratto.json
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1003.eqiad.wmnet
* 12:59 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:59 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1002.eqiad.wmnet
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1002.eqiad.wmnet
* 12:58 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:57 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2162 with weight 0 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92771 and previous config saved to /var/cache/conftool/dbconfig/20260521-125645-cwilliams.json
* 12:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T426936|T426936]]
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet
* 12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
* 12:54 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1002.eqiad.wmnet
* 12:54 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6008.drmrs.wmnet
* 12:53 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:52 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1002.eqiad.wmnet
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
* 12:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:48 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3066.esams.wmnet
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92770 and previous config saved to /var/cache/conftool/dbconfig/20260521-124707-fceratto.json
* 12:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
* 12:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1039: Repooling
* 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet
* 12:45 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:43 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] (duration: 07m 54s)
* 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92768 and previous config saved to /var/cache/conftool/dbconfig/20260521-124014-fceratto.json
* 12:39 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
* 12:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 12:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:36 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:36 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]]
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:34 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:34 kart_: Updated cxserver to 2026-05-20-034002-production ([[phab:T388690|T388690]], [[phab:T404295|T404295]], [[phab:T391703|T391703]], [[phab:T426605|T426605]])
* 12:34 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
* 12:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
* 12:30 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:30 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
* 12:29 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92767 and previous config saved to /var/cache/conftool/dbconfig/20260521-122905-fceratto.json
* 12:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
* 12:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92766 and previous config saved to /var/cache/conftool/dbconfig/20260521-122839-fceratto.json
* 12:27 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:27 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:26 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-staging-worker
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2003.codfw.wmnet
* 12:23 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2003.codfw.wmnet
* 12:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
* 12:21 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:21 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:21 moritzm: installing nginx security updates
* 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 12:20 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92765 and previous config saved to /var/cache/conftool/dbconfig/20260521-121832-fceratto.json
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2003.codfw.wmnet
* 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
* 12:15 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
* 12:13 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6007.drmrs.wmnet
* 12:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92764 and previous config saved to /var/cache/conftool/dbconfig/20260521-120824-fceratto.json
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2003.codfw.wmnet
* 12:07 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2002.codfw.wmnet
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2002.codfw.wmnet
* 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
* 12:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6014.drmrs.wmnet
* 12:00 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:00 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2002.codfw.wmnet
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
* 11:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92763 and previous config saved to /var/cache/conftool/dbconfig/20260521-115817-fceratto.json
* 11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
* 11:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
* 11:51 taavi: disabling puppet on C:bird to roll out {{Gerrit|1289919}}
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92762 and previous config saved to /var/cache/conftool/dbconfig/20260521-115112-fceratto.json
* 11:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2047.codfw.wmnet with reason: Maintenance
* 11:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2002.codfw.wmnet
* 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92761 and previous config saved to /var/cache/conftool/dbconfig/20260521-115043-fceratto.json
* 11:50 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2001.codfw.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2001.codfw.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt2002.wikimedia.org
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet
* 11:45 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2001.codfw.wmnet
* 11:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp1001.eqiad.wmnet
* 11:44 kartik@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet
* 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt2002.wikimedia.org
* 11:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet
* 11:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92760 and previous config saved to /var/cache/conftool/dbconfig/20260521-114036-fceratto.json
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp1001.eqiad.wmnet
* 11:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
* 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 11:36 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet
* 11:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2001.codfw.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker
* 11:35 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
* 11:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
* 11:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
* 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92759 and previous config saved to /var/cache/conftool/dbconfig/20260521-113028-fceratto.json
* 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2014.codfw.wmnet
* 11:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
* 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
* 11:26 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet
* 11:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2014.codfw.wmnet
* 11:20 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6013.drmrs.wmnet
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92758 and previous config saved to /var/cache/conftool/dbconfig/20260521-112021-fceratto.json
* 11:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
* 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2013.codfw.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
* 11:09 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92757 and previous config saved to /var/cache/conftool/dbconfig/20260521-110851-fceratto.json
* 11:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2037.codfw.wmnet with reason: Maintenance
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92756 and previous config saved to /var/cache/conftool/dbconfig/20260521-110822-fceratto.json
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
* 11:05 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2013.codfw.wmnet
* 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:04 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6006.drmrs.wmnet
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
* 11:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
* 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92753 and previous config saved to /var/cache/conftool/dbconfig/20260521-105815-fceratto.json
* 10:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet
* 10:55 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:54 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2012.codfw.wmnet
* 10:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92752 and previous config saved to /var/cache/conftool/dbconfig/20260521-104807-fceratto.json
* 10:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2012.codfw.wmnet
* 10:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet
* 10:44 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] (duration: 08m 02s)
* 10:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:40 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet
* 10:40 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:39 jiji@deploy1003: jiji: Continuing with deployment
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92751 and previous config saved to /var/cache/conftool/dbconfig/20260521-103759-fceratto.json
* 10:37 jiji@deploy1003: jiji: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:36 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]]
* 10:35 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet
* 10:34 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:29 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet
* 10:27 dcausse: [[phab:T423993|T423993]]: reindexing all archive indices
* 10:27 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92749 and previous config saved to /var/cache/conftool/dbconfig/20260521-102630-fceratto.json
* 10:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2036.codfw.wmnet with reason: Maintenance
* 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92748 and previous config saved to /var/cache/conftool/dbconfig/20260521-102601-fceratto.json
* 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2011.codfw.wmnet
* 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6005.drmrs.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet
* 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2011.codfw.wmnet
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet
* 10:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92747 and previous config saved to /var/cache/conftool/dbconfig/20260521-101552-fceratto.json
* 10:15 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:14 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet
* 10:12 moritzm: installing postgresql security updates
* 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet
* 10:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet
* 10:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon1003.wikimedia.org
* 10:09 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:08 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet
* 10:08 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet
* 10:07 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet
* 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet
* 10:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92746 and previous config saved to /var/cache/conftool/dbconfig/20260521-100545-fceratto.json
* 10:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet
* 10:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet
* 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet
* 10:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet
* 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet
* 10:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org
* 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org
* 09:59 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1005.eqiad.wmnet
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-codfw
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2005.codfw.wmnet
* 09:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2005.codfw.wmnet
* 09:56 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
* 09:56 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:56 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92745 and previous config saved to /var/cache/conftool/dbconfig/20260521-095536-fceratto.json
* 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1384.eqiad.wmnet
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2004.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2004.codfw.wmnet
* 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
* 09:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
* 09:49 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1384.eqiad.wmnet
* 09:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1383.eqiad.wmnet
* 09:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92744 and previous config saved to /var/cache/conftool/dbconfig/20260521-094829-fceratto.json
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92743 and previous config saved to /var/cache/conftool/dbconfig/20260521-094801-fceratto.json
* 09:47 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet
* 09:47 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 [[phab:T426563|T426563]]
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-eqiad
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:44 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1383.eqiad.wmnet
* 09:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1382.eqiad.wmnet
* 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
* 09:39 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet
* 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1382.eqiad.wmnet
* 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1381.eqiad.wmnet
* 09:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2002.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2002.codfw.wmnet
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92742 and previous config saved to /var/cache/conftool/dbconfig/20260521-093754-fceratto.json
* 09:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet
* 09:36 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 09:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6012.drmrs.wmnet
* 09:34 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet
* 09:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1381.eqiad.wmnet
* 09:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1380.eqiad.wmnet
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2001.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2001.codfw.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet
* 09:29 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=eqiad
* 09:29 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts.*,name=codfw
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet
* 09:28 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92741 and previous config saved to /var/cache/conftool/dbconfig/20260521-092746-fceratto.json
* 09:27 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1380.eqiad.wmnet
* 09:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1379.eqiad.wmnet
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet
* 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
* 09:25 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet
* 09:24 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=codfw
* 09:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:23 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-eqiad
* 09:22 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1379.eqiad.wmnet
* 09:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1378.eqiad.wmnet
* 09:21 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-codfw
* 09:21 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:18 moritzm: remove ganeti1023 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92740 and previous config saved to /var/cache/conftool/dbconfig/20260521-091738-fceratto.json
* 09:16 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1378.eqiad.wmnet
* 09:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1376.eqiad.wmnet
* 09:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Repooling
* 09:07 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1376.eqiad.wmnet
* 09:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1375.eqiad.wmnet
* 09:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92738 and previous config saved to /var/cache/conftool/dbconfig/20260521-090609-fceratto.json
* 09:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1037.eqiad.wmnet with reason: Maintenance
* 09:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1375.eqiad.wmnet
* 09:01 btullis@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:55 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6011.drmrs.wmnet
* 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:44 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6004.drmrs.wmnet
* 08:37 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Repooling
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92733 and previous config saved to /var/cache/conftool/dbconfig/20260521-082951-fceratto.json
* 08:29 hashar@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92731 and previous config saved to /var/cache/conftool/dbconfig/20260521-081642-fceratto.json
* 08:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
* 08:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6003.drmrs.wmnet
* 08:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1256.eqiad.wmnet with OS trixie
* 07:52 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:51 marostegui@dns1004: END - running authdns-update
* 07:50 marostegui@dns1004: START - running authdns-update
* 07:48 marostegui: Failover m3-master [[phab:T426633|T426633]]
* 07:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet
* 07:46 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6010.drmrs.wmnet
* 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:38 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6002.drmrs.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:24 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:24 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1256.eqiad.wmnet with OS trixie
* 07:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy1025.eqiad.wmnet with reason: Rebooting
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:52 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
* 06:40 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 06:39 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
* 06:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
* 06:24 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:23 arnaudb@cumin1003: END (FAIL) - Cookbook sre.gerrit.reboot-gerrit (exit_code=99) Rebooting Gerrit on gerrit2003
* 06:22 arnaudb@cumin1003: START - Cookbook sre.gerrit.reboot-gerrit Rebooting Gerrit on gerrit2003
* 06:15 marostegui@dns1004: END - running authdns-update
* 06:14 marostegui: Failover m2-master [[phab:T426633|T426633]]
* 06:13 marostegui@dns1004: START - running authdns-update
* 05:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1012 from dbctl [[phab:T426930|T426930]]', diff saved to https://phabricator.wikimedia.org/P92728 and previous config saved to /var/cache/conftool/dbconfig/20260521-053858-marostegui.json
* 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92727 and previous config saved to /var/cache/conftool/dbconfig/20260521-053000-marostegui.json
* 05:29 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1022 to pc2 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92726 and previous config saved to /var/cache/conftool/dbconfig/20260521-052905-marostegui.json
* 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc1012.eqiad.wmnet with reason: Cloning
* 02:41 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on planet1003.eqiad.wmnet with reason: debug wip
* 02:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1027.eqiad.wmnet
* 01:22 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1027.eqiad.wmnet
* 00:55 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2428864
2428860
2026-06-21T02:00:23Z
Stashbot
7414
mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2428864
wikitext
text/x-wiki
== 2026-06-21 ==
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-20 ==
* 13:32 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:31 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:31 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-19 ==
* 19:21 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303006{{!}}Disable ShortUrl on remaining wikis (T107188)]] (duration: 80m 14s)
* 19:17 krinkle@deploy1003: krinkle: Continuing with deployment
* 18:03 krinkle@deploy1003: krinkle: Backport for [[gerrit:1303006{{!}}Disable ShortUrl on remaining wikis (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:01 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1303006{{!}}Disable ShortUrl on remaining wikis (T107188)]]
* 16:22 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 16:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1023.eqiad.wmnet
* 16:08 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 16:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1023.eqiad.wmnet
* 16:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1022.eqiad.wmnet
* 15:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1022.eqiad.wmnet
* 15:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1021.eqiad.wmnet
* 15:45 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1021.eqiad.wmnet
* 15:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet
* 15:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1020.eqiad.wmnet
* 15:34 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 15:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2004.codfw.wmnet
* 15:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2004.codfw.wmnet
* 15:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2003.codfw.wmnet
* 15:17 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2003.codfw.wmnet
* 15:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2002.codfw.wmnet
* 15:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2002.codfw.wmnet
* 15:11 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet
* 14:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2009.codfw.wmnet with OS trixie
* 13:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
* 13:41 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
* 13:28 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
* 13:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2001.codfw.wmnet
* 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs1003.eqiad.wmnet
* 12:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs1003.eqiad.wmnet
* 12:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs1002.eqiad.wmnet
* 12:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs1002.eqiad.wmnet
* 12:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs1001.eqiad.wmnet
* 12:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs1001.eqiad.wmnet
* 12:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1022.eqiad.wmnet
* 12:32 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1022.eqiad.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2234.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2234.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2232.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2232.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 12:10 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 12:08 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on phab2002.codfw.wmnet with reason: Host Replacement
* 12:05 urbanecm@deploy1003: mwscript-k8s job started: GrowthExperiments:migrateMentorStatusAway.php --wiki=viwiki # [[phab:T409170|T409170]]
* 12:04 urbanecm@deploy1003: mwscript-k8s job started: GrowthExperiments:MigrateMentorStatusAway --wiki=viwiki # [[phab:T409170|T409170]]
* 11:33 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:38 moritzm: imported nodejs 24.17.0-1nodesource1 to thirdparty/node24 for trixie-wikimedia
* 10:37 moritzm: imported nodejs 22.23.0-1nodesource1 to thirdparty/node22 for trixie-wikimedia
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 10:29 sergi0: Run `MigrateMentorStatusAway` script for all wikis in growthexperiments dblist - [[phab:T409170|T409170]]
* 10:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet
* 10:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1020.eqiad.wmnet
* 10:04 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1024
* 10:03 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1024
* 10:03 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1023
* 10:03 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1023
* 10:03 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1021
* 10:03 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1021
* 10:00 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1024
* 09:59 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1024
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1022
* 09:57 btullis@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1022
* 09:56 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1020
* 09:54 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1020
* 09:43 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet
* 09:36 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1020.eqiad.wmnet
* 07:32 slyngs: Update IDP/SSO to CAS v7.3.7.3
* 07:31 slyngshede@dns1004: END - running authdns-update
* 07:30 slyngshede@dns1004: START - running authdns-update
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 49s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:19 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: sync
* 01:18 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: sync
* 01:18 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: sync
* 01:17 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics: sync
* 01:17 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: sync
* 01:17 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics: sync
* 01:06 ottomata: roll restart eventgate-analytics to pick up stream config change - [[phab:T427787|T427787]]
== 2026-06-18 ==
* 23:46 Amir1: ALTER TABLE reading_list_project AUTO_INCREMENT = 882; on wikishared on x1 master ([[phab:T428002|T428002]])
* 23:34 rzl@deploy1003: Finished deploy [docker-pkg/deploy@f030aed]: (no justification provided) (duration: 00m 45s)
* 23:33 rzl@deploy1003: Started deploy [docker-pkg/deploy@f030aed]: (no justification provided)
* 23:28 rzl@deploy1003: Finished deploy [docker-pkg/deploy@f030aed]: (no justification provided) (duration: 00m 26s)
* 23:27 rzl@deploy1003: Started deploy [docker-pkg/deploy@f030aed]: (no justification provided)
* 23:03 rzl: rzl@apt1002:~$ sudo -i reprepro -C main include trixie-wikimedia /home/rzl/httpbb/trixie/httpbb_0.0.5-1+deb13u1_amd64.changes # [[phab:T427899|T427899]]
* 22:52 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304195{{!}}hCaptcha: Re-enable for mcrundo (T427612)]] (duration: 07m 25s)
* 22:47 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 22:46 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1304195{{!}}hCaptcha: Re-enable for mcrundo (T427612)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1304195{{!}}hCaptcha: Re-enable for mcrundo (T427612)]]
* 21:29 maryum: Deployed security fix for [[phab:T428833|T428833]]
* 21:14 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303493{{!}}Prevent surveys being automatically added to non-Wikipedias (T393436)]] (duration: 07m 54s)
* 21:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 21:10 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 21:09 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 21:08 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1303493{{!}}Prevent surveys being automatically added to non-Wikipedias (T393436)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:06 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1303493{{!}}Prevent surveys being automatically added to non-Wikipedias (T393436)]]
* 20:12 dani@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303895{{!}}Deploy English Wikipedia Mobile App Survey (T428876)]] (duration: 08m 20s)
* 20:08 dani@deploy1003: dani: Continuing with deployment
* 20:06 dani@deploy1003: dani: Backport for [[gerrit:1303895{{!}}Deploy English Wikipedia Mobile App Survey (T428876)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 dani@deploy1003: Started scap sync-world: Backport for [[gerrit:1303895{{!}}Deploy English Wikipedia Mobile App Survey (T428876)]]
* 19:11 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=dns7002.*
* 19:09 cdobbins@dns1004: END - running authdns-update
* 19:08 cdobbins@dns1004: START - running authdns-update
* 19:07 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=dns7002.*,service=authdns-update
* 19:05 cdobbins@dns1004: END - running authdns-update
* 19:04 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2002.codfw.wmnet with reason: Host Replacement
* 19:03 cdobbins@dns1004: START - running authdns-update
* 19:01 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dns7002.wikimedia.org
* 19:01 cdobbins@cumin2002: START - Cookbook sre.hosts.remove-downtime for dns7002.wikimedia.org
* 18:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns7002.wikimedia.org with OS bookworm
* 18:39 jhuneidi@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.7 refs [[phab:T423916|T423916]]
* 18:37 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 18:34 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 18:33 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 18:31 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 18:29 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:28 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:27 swfrench-wmf: (eqiad) kubectl delete pod coredns-54cdd9bdf-6hwb5 -n kube-system - [[phab:T429156|T429156]]
* 18:27 swfrench-wmf: (eqiad) kubectl delete pod coredns-54cdd9bdf-6n4ps -n kube-system - [[phab:T429156|T429156]]
* 18:26 jhuneidi@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304067{{!}}SpecialSpecialPages: Guard against special pages with no content-language alias (T429584)]] (duration: 08m 46s)
* 18:21 jhuneidi@deploy1003: jhuneidi, jforrester: Continuing with deployment
* 18:19 jhuneidi@deploy1003: jhuneidi, jforrester: Backport for [[gerrit:1304067{{!}}SpecialSpecialPages: Guard against special pages with no content-language alias (T429584)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:17 jhuneidi@deploy1003: Started scap sync-world: Backport for [[gerrit:1304067{{!}}SpecialSpecialPages: Guard against special pages with no content-language alias (T429584)]]
* 18:09 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 18:04 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 17:37 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host dns7002.wikimedia.org with OS bookworm
* 16:28 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304112{{!}}Add script to fix fr_archive_name drifts (T428406)]] (duration: 06m 46s)
* 16:24 zabe@deploy1003: zabe: Continuing with deployment
* 16:24 zabe@deploy1003: zabe: Backport for [[gerrit:1304112{{!}}Add script to fix fr_archive_name drifts (T428406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:22 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1304112{{!}}Add script to fix fr_archive_name drifts (T428406)]]
* 15:55 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303981{{!}}LocalFileMoveBatch: Also update fr_archive_name when moving file (T428406)]] (duration: 06m 49s)
* 15:51 zabe@deploy1003: zabe: Continuing with deployment
* 15:51 zabe@deploy1003: zabe: Backport for [[gerrit:1303981{{!}}LocalFileMoveBatch: Also update fr_archive_name when moving file (T428406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1303981{{!}}LocalFileMoveBatch: Also update fr_archive_name when moving file (T428406)]]
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:08 elukey@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 15:08 elukey@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 15:04 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304082{{!}}Check that data-parsoid is an array before accessing it as such (T429582)]] (duration: 11m 17s)
* 15:00 cscott@deploy1003: ihurbain, cscott: Continuing with deployment
* 14:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:57 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:55 cscott@deploy1003: ihurbain, cscott: Backport for [[gerrit:1304082{{!}}Check that data-parsoid is an array before accessing it as such (T429582)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:53 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1304082{{!}}Check that data-parsoid is an array before accessing it as such (T429582)]]
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:51 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:51 ayounsi@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:46 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:42 moritzm: installing zsh updates from Bookworm point release
* 14:37 brouberol@dns1004: END - running authdns-update
* 14:35 brouberol@dns1004: START - running authdns-update
* 14:27 jgreen@dns1004: END - running authdns-update
* 14:25 jgreen@dns1004: START - running authdns-update
* 14:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy2007.codfw.wmnet
* 14:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy2007.codfw.wmnet
* 14:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy2008.codfw.wmnet
* 14:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy2008.codfw.wmnet
* 14:20 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 14:20 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 14:19 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2235.codfw.wmnet
* 14:19 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2235.codfw.wmnet
* 14:14 Msz2001: Finished deploying private code change
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2235.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2008.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 14:08 moritzm: installing unbound security updates
* 14:07 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2234.codfw.wmnet
* 14:07 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2234.codfw.wmnet
* 14:00 tgr_: UTC afternoon deploys done
* 14:00 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304038{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]], [[gerrit:1304039{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]] (duration: 11m 51s)
* 13:56 tgr@deploy1003: tgr: Continuing with deployment
* 13:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2234.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2007.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:52 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy2005.codfw.wmnet
* 13:52 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy2005.codfw.wmnet
* 13:51 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2232.codfw.wmnet
* 13:51 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2232.codfw.wmnet
* 13:51 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 13:51 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 13:50 tgr@deploy1003: tgr: Backport for [[gerrit:1304038{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]], [[gerrit:1304039{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:48 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1304038{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]], [[gerrit:1304039{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]]
* 13:46 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303613{{!}}magwiki: add wordmark, metanamespace, sitename and timezone (T428279)]], [[gerrit:1304004{{!}}stream: webrequest.page_trending.dev0 (T429588)]] (duration: 08m 15s)
* 13:42 tgr@deploy1003: javiermonton, tgr, anzx: Continuing with deployment
* 13:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 13:40 tgr@deploy1003: javiermonton, tgr, anzx: Backport for [[gerrit:1303613{{!}}magwiki: add wordmark, metanamespace, sitename and timezone (T428279)]], [[gerrit:1304004{{!}}stream: webrequest.page_trending.dev0 (T429588)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:38 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1303613{{!}}magwiki: add wordmark, metanamespace, sitename and timezone (T428279)]], [[gerrit:1304004{{!}}stream: webrequest.page_trending.dev0 (T429588)]]
* 13:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2232.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2005.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:33 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 13:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303004{{!}}REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771)]] (duration: 06m 56s)
* 13:26 ladsgroup@deploy1003: ladsgroup, bpirkle: Continuing with deployment
* 13:25 ladsgroup@deploy1003: ladsgroup, bpirkle: Backport for [[gerrit:1303004{{!}}REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303004{{!}}REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771)]]
* 13:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:21 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302923{{!}}EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380)]] (duration: 10m 55s)
* 13:14 ladsgroup@deploy1003: ladsgroup, lerickson: Continuing with deployment
* 13:10 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:10 ladsgroup@deploy1003: ladsgroup, lerickson: Backport for [[gerrit:1302923{{!}}EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:08 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:08 fabfur: deploying new haproxykafka on A:cp to parse for x_provenance ([[phab:T427068|T427068]])
* 13:08 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1302923{{!}}EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of testvm2005.codfw.wmnet to plain
* 13:05 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to plain
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
* 13:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis magwiki in section s5
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 12:56 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 12:39 fabfur: upgrade haproxykafka on cp1111 to test for new x-provenance field ([[phab:T427068|T427068]])
* 12:36 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 12:35 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis magwiki in section s5
* 12:34 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 12:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis magwiki in section s5
* 12:31 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304017{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]], [[gerrit:1304016{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]] (duration: 17m 49s)
* 12:29 cwilliams@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis magwiki in section s5
* 12:27 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:16 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1304017{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]], [[gerrit:1304016{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:14 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1304017{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]], [[gerrit:1304016{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]]
* 11:10 atsukoito: atsuko updated charlie to 0.0.19 https://w.wiki/RPKN
* 10:37 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.disable-merges (exit_code=99)
* 10:37 jmm@cumin2002: START - Cookbook sre.puppet.disable-merges
* 10:24 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303986{{!}}hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394)]] (duration: 12m 13s)
* 10:19 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 10:14 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1303986{{!}}hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1303986{{!}}hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394)]]
* 10:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 10:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 10:01 fabfur@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]"
* 10:01 fabfur@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]
* 10:00 fabfur@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]
* 10:00 fabfur@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]"
* 09:59 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303983{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]], [[gerrit:1303982{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]] (duration: 08m 10s)
* 09:55 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:54 kharlan@deploy1003: kharlan: Backport for [[gerrit:1303983{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]], [[gerrit:1303982{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:51 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1303983{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]], [[gerrit:1303982{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]]
* 09:33 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 09:33 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 09:33 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 09:32 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 09:11 moritzm: installing apache2 security updates
* 08:55 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 08:53 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 08:53 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 08:51 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 08:51 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 08:51 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 08:35 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:34 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:22 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 08:21 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 08:20 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 08:19 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 08:05 moritzm: regenerate pbuilder environments on build2001 to use deb.debian.org [[phab:T416707|T416707]]
* 08:02 moritzm: uploaded wmf-laptop 1.0.6 to component/wmf-laptop on apt.wikimedia.org
* 08:01 moritzm: regenerate pbuilder environments on build2002 to use deb.debian.org [[phab:T416707|T416707]]
* 06:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 06:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2040: Migration of es2040.codfw.wmnet completed
* 06:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2040: Migration of es2040.codfw.wmnet completed
* 05:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2040.codfw.wmnet with OS trixie
* 05:41 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
* 05:41 marostegui@cumin1003: Removing db1224 from zarcillo [[phab:T429561|T429561]]
* 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1224.eqiad.wmnet
* 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1224.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:40 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1224.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:36 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 05:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2040.codfw.wmnet with reason: host reimage
* 05:31 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2040.codfw.wmnet with reason: host reimage
* 05:31 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db1224.eqiad.wmnet
* 05:30 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 05:27 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db1224 from dbctl [[phab:T429561|T429561]]', diff saved to https://phabricator.wikimedia.org/P94269 and previous config saved to /var/cache/conftool/dbconfig/20260618-052737-marostegui.json
* 05:14 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2040.codfw.wmnet with OS trixie
* 05:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2040: Upgrading es2040.codfw.wmnet
* 05:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2040: Upgrading es2040.codfw.wmnet
* 05:12 marostegui@cumin1003: dbmaint on es7@codfw [[phab:T429463|T429463]]
* 05:12 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 45s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303600{{!}}Update interwiki map (T428266)]] (duration: 06m 55s)
* 01:15 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 01:14 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1303600{{!}}Update interwiki map (T428266)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:12 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303600{{!}}Update interwiki map (T428266)]]
* 00:48 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303596{{!}}Activate magwiki (T428266)]] (duration: 07m 25s)
* 00:43 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:42 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1303596{{!}}Activate magwiki (T428266)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:40 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303596{{!}}Activate magwiki (T428266)]]
* 00:33 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303594{{!}}Init magwiki (T428266)]] (duration: 07m 14s)
* 00:29 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:28 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1303594{{!}}Init magwiki (T428266)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303594{{!}}Init magwiki (T428266)]]
== 2026-06-17 ==
* 23:26 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303504{{!}}Enable beta mobile MMV on Wikipedias (T426775)]] (duration: 06m 46s)
* 23:22 egardner@deploy1003: egardner: Continuing with deployment
* 23:21 egardner@deploy1003: egardner: Backport for [[gerrit:1303504{{!}}Enable beta mobile MMV on Wikipedias (T426775)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:19 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1303504{{!}}Enable beta mobile MMV on Wikipedias (T426775)]]
* 23:17 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303552{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303553{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303554{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] (duration: 06m 55s)
* 23:14 mutante: gerrit2002 - unlink /srv/gerrit/site_path/review_site/logs/logs ([[phab:T425667|T425667]])
* 23:12 egardner@deploy1003: egardner: Continuing with deployment
* 23:12 egardner@deploy1003: egardner: Backport for [[gerrit:1303552{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303553{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303554{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:10 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1303552{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303553{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303554{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]]
* 23:04 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303571{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303572{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303573{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] (duration: 12m 31s)
* 22:57 egardner@deploy1003: egardner: Continuing with deployment
* 22:56 egardner@deploy1003: egardner: Backport for [[gerrit:1303571{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303572{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303573{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:52 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1303571{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303572{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303573{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]]
* 22:45 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303517{{!}}Donor Delight Badge: Add accessible label and hide popover from AT (T427313)]] (duration: 31m 01s)
* 22:32 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:31 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1303517{{!}}Donor Delight Badge: Add accessible label and hide popover from AT (T427313)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:14 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1303517{{!}}Donor Delight Badge: Add accessible label and hide popover from AT (T427313)]]
* 21:52 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:52 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:29 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:29 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:29 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:28 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:27 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:27 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:23 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:22 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:22 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:21 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:20 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:20 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:15 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:12 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:12 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:09 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:06 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:05 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:02 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 21:02 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 20:45 cdobbins@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dns7002.wikimedia.org with reason: bird.service keeps failing
* 20:41 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-ats (exit_code=0) rolling restart_daemons on A:cp
* 20:41 cdobbins@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns7002.wikimedia.org with OS trixie
* 20:36 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303012{{!}}Enable ULS v2 on group1 wikis]] (duration: 08m 26s)
* 20:31 sbisson@deploy1003: sbisson, abi: Continuing with deployment
* 20:29 sbisson@deploy1003: sbisson, abi: Backport for [[gerrit:1303012{{!}}Enable ULS v2 on group1 wikis]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:27 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1303012{{!}}Enable ULS v2 on group1 wikis]]
* 20:17 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303365{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]], [[gerrit:1303364{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]] (duration: 06m 55s)
* 20:13 sgimeno@deploy1003: sgimeno: Continuing with deployment
* 20:12 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1303365{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]], [[gerrit:1303364{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:11 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1303365{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]], [[gerrit:1303364{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]]
* 19:44 jgreen@dns1005: END - running authdns-update
* 19:42 jgreen@dns1005: START - running authdns-update
* 19:31 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5005*<nowiki>}</nowiki> and A:liberica ([[phab:T428229|T428229]])
* 19:30 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5005*<nowiki>}</nowiki> and A:liberica ([[phab:T428229|T428229]])
* 19:16 jhuneidi@deploy1003: Finished scap sync-world: wmf.7 to group 1 (Take 2) (duration: 07m 01s)
* 19:16 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-purged (exit_code=0) rolling restart_daemons on A:cp and not P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:10 jhuneidi@deploy1003: Started scap sync-world: wmf.7 to group 1 (Take 2)
* 19:08 jhuneidi@deploy1003: Finished scap sync-world: Attempt to roll wmf.7 to group 1 (duration: 07m 24s)
* 19:01 jhuneidi@deploy1003: Started scap sync-world: Attempt to roll wmf.7 to group 1
* 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcontrol1008-dev.eqiad.wmnet
* 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol1008-dev.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 18:59 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol1008-dev.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 18:52 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 18:46 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudcontrol1008-dev.eqiad.wmnet
* 18:24 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6011.*
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6011.drmrs.wmnet
* 18:24 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp6011.drmrs.wmnet
* 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp6011.drmrs.wmnet with reason: ats restart, continuing from failed cookbook run
* 18:17 brett: commit new lvs5005 IP address to cr2-eqsin.wikimedia.org,cr3-eqsin.wikimedia.org
* 18:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp6011.drmrs.wmnet
* 18:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp6011.drmrs.wmnet
* 18:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6011.*
* 17:41 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs5005.eqsin.wmnet with OS bookworm
* 17:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs5005.eqsin.wmnet with reason: host reimage
* 17:16 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs5005.eqsin.wmnet with reason: host reimage
* 17:06 mutante: contint1003 - even with gerrit:1301416 jenkins was STILL restarted :/ - stopping it manually and puppet - debugging - [[phab:T418521|T418521]]
* 17:03 mutante: contint1003 - re-enabling puppet - checking it does NOT start jenkins - also see gerrit:1297236 and gerrit:1301416 - [[phab:T418521|T418521]]
* 16:51 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:51 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:49 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-ats rolling restart_daemons on A:cp
* 16:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host lvs5005
* 16:48 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs5005
* 16:48 dcausse@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:47 dcausse@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:47 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs5005
* 16:47 brett@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:47 brett@cumin2002: START - Cookbook sre.dns.wipe-cache lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:45 brett@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:45 brett@cumin2002: START - Cookbook sre.dns.wipe-cache lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:45 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:45 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host lvs5005 - brett@cumin2002"
* 16:45 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host lvs5005 - brett@cumin2002"
* 16:45 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:45 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:39 brett@cumin2002: START - Cookbook sre.dns.netbox
* 16:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1078.eqiad.wmnet with OS trixie
* 16:16 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:16 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host lvs5005
* 16:16 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:15 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs5005.eqsin.wmnet with OS bookworm
* 16:15 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1007.eqiad.wmnet with OS trixie
* 16:15 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:11 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:02 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 16:02 brett@cumin2002: START - Cookbook sre.loadbalancer.admin depooling P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 16:00 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-purged rolling restart_daemons on A:cp and not P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 15:58 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1078.eqiad.wmnet with reason: host reimage
* 15:54 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1007.eqiad.wmnet with reason: host reimage
* 15:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Migration of es2048.codfw.wmnet completed
* 15:53 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1078.eqiad.wmnet with reason: host reimage
* 15:47 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1007.eqiad.wmnet with reason: host reimage
* 15:46 moritzm: installing python-ldap security updates
* 15:42 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1078.eqiad.wmnet with OS trixie
* 15:30 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:27 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 15:26 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
* 15:08 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Migration of es2048.codfw.wmnet completed
* 15:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:03 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1004.eqiad.wmnet with OS trixie
* 15:02 aokoth@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab (duration: 01m 24s)
* 15:00 aokoth@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab
* 14:59 cdobbins@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 14:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2048.codfw.wmnet with OS trixie
* 14:56 cdobbins@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 14:44 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1004.eqiad.wmnet with reason: host reimage
* 14:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2048.codfw.wmnet with reason: host reimage
* 14:35 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1004.eqiad.wmnet with reason: host reimage
* 14:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2048.codfw.wmnet with reason: host reimage
* 14:28 cdobbins@cumin1003: START - Cookbook sre.hosts.reimage for host dns7002.wikimedia.org with OS trixie
* 14:26 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303436{{!}}Add Wikidata configuration for WikiProject links (T422935 T422936)]] (duration: 07m 49s)
* 14:22 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
* 14:21 cjd91: depooling dns7002 to attempt reimage to trixie
* 14:20 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1303436{{!}}Add Wikidata configuration for WikiProject links (T422935 T422936)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:19 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1004.eqiad.wmnet with OS trixie
* 14:19 cdobbins@cumin1003: conftool action : set/pooled=no; selector: name=dns7002.*
* 14:18 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1303436{{!}}Add Wikidata configuration for WikiProject links (T422935 T422936)]]
* 14:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2048.codfw.wmnet with OS trixie
* 14:17 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 14:17 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 14:17 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 14:16 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 14:16 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2048: Upgrading es2048.codfw.wmnet
* 14:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2048: Upgrading es2048.codfw.wmnet
* 14:13 elukey: add basic Kafka ACLs for anonymous to logging-eqiad - [[phab:T425528|T425528]]
* 14:13 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:13 Lucas_WMDE: UTC afternoon backport+config window done
* {{safesubst:SAL entry|1=14:13 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302739{{!}}ULS rewrite: Lock body scroll when open on mobile]], [[gerrit:1302743{{!}}ULS rewrite: Fix settings dialog width and field sizing (T416512)]], [[gerrit:1303010{{!}}ULS rewrite: Show variants even when no languages are available (T426532)]], [[gerrit:1303009{{!}}ULS rewrite: Capture trigger element before async module load (T429145)]], [[gerr}}
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs-test1001.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1003.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1002.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1001.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs-test1001.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1003.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1002.eqiad.wmnet
* 14:11 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1001.eqiad.wmnet
* 14:11 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs*.eqiad.wmnet
* 14:08 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, abi: Continuing with deployment
* 14:06 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:01 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 14:00 jmm@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2003.codfw.wmnet with OS bookworm
* 13:58 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2004.codfw.wmnet with OS bookworm
* 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* {{safesubst:SAL entry|1=13:55 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, abi: Backport for [[gerrit:1302739{{!}}ULS rewrite: Lock body scroll when open on mobile]], [[gerrit:1302743{{!}}ULS rewrite: Fix settings dialog width and field sizing (T416512)]], [[gerrit:1303010{{!}}ULS rewrite: Show variants even when no languages are available (T426532)]], [[gerrit:1303009{{!}}ULS rewrite: Capture trigger element before async module load (T429145)]], [[ge}}
* {{safesubst:SAL entry|1=13:53 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1302739{{!}}ULS rewrite: Lock body scroll when open on mobile]], [[gerrit:1302743{{!}}ULS rewrite: Fix settings dialog width and field sizing (T416512)]], [[gerrit:1303010{{!}}ULS rewrite: Show variants even when no languages are available (T426532)]], [[gerrit:1303009{{!}}ULS rewrite: Capture trigger element before async module load (T429145)]], [[gerri}}
* 13:52 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 13:51 jmm@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 13:51 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.bmc-user-mgmt (exit_code=0) for host sretest[2001,2003-2004,2006,2009-2010].codfw.wmnet,sretest1005.eqiad.wmnet
* 13:50 elukey@cumin1003: START - Cookbook sre.hosts.bmc-user-mgmt for host sretest[2001,2003-2004,2006,2009-2010].codfw.wmnet,sretest1005.eqiad.wmnet
* 13:47 papaul: mgmt interface change on mr1-codfw
* 13:46 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-codfw with reason: mgmt interface change
* 13:45 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-codfw with reason: switch refresh
* 13:42 jmm@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:42 jmm@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 13:33 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298293{{!}}Add Wikidata configuration for WikiProject links (T422935)]], [[gerrit:1299943{{!}}Add instance-of WikiProject links for paintings and elections (T422936)]] (duration: 08m 14s)
* 13:32 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1006.eqiad.wmnet with OS trixie
* 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudcephosd1016
* 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudcephosd1016
* 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1061
* 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1061
* 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1069
* 13:31 lucaswerkmeister-wmde@deploy1003: sadiyamohammed13, lucaswerkmeister-wmde: Rolling back deployment
* 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1069
* 13:30 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1068
* 13:30 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1068
* 13:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1005.eqiad.wmnet with OS trixie
* 13:27 lucaswerkmeister-wmde@deploy1003: sadiyamohammed13, lucaswerkmeister-wmde: Backport for [[gerrit:1298293{{!}}Add Wikidata configuration for WikiProject links (T422935)]], [[gerrit:1299943{{!}}Add instance-of WikiProject links for paintings and elections (T422936)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298293{{!}}Add Wikidata configuration for WikiProject links (T422935)]], [[gerrit:1299943{{!}}Add instance-of WikiProject links for paintings and elections (T422936)]]
* 13:24 jmm@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 13:23 jmm@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 13:14 dani@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302998{{!}}Add English Wikipedia Mobile App Survey (T428876)]] (duration: 07m 53s)
* 13:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1006.eqiad.wmnet with reason: host reimage
* 13:11 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-codfw
* 13:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1005.eqiad.wmnet with reason: host reimage
* 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-eqiad
* 13:10 dani@deploy1003: dani: Continuing with deployment
* 13:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1045: repool after upgrade
* 13:08 dani@deploy1003: dani: Backport for [[gerrit:1302998{{!}}Add English Wikipedia Mobile App Survey (T428876)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1006.eqiad.wmnet with reason: host reimage
* 13:06 dani@deploy1003: Started scap sync-world: Backport for [[gerrit:1302998{{!}}Add English Wikipedia Mobile App Survey (T428876)]]
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1005.eqiad.wmnet with reason: host reimage
* 13:00 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:53 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:52 blake@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host mc-gp1006
* 12:52 blake@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host mc-gp1006
* 12:51 blake@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc-gp1006
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mc-gp1006.eqiad.wmnet 182.48.64.10.in-addr.arpa 2.8.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:51 blake@cumin1003: START - Cookbook sre.dns.wipe-cache mc-gp1006.eqiad.wmnet 182.48.64.10.in-addr.arpa 2.8.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host mc-gp1005
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host mc-gp1005
* 12:49 blake@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc-gp1005
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mc-gp1005.eqiad.wmnet 126.32.64.10.in-addr.arpa 6.2.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:49 blake@cumin1003: START - Cookbook sre.dns.wipe-cache mc-gp1005.eqiad.wmnet 126.32.64.10.in-addr.arpa 6.2.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host mc-gp1005 - blake@cumin1003"
* 12:49 blake@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host mc-gp1005 - blake@cumin1003"
* 12:48 blake@cumin1003: START - Cookbook sre.dns.netbox
* 12:45 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-codfw
* 12:45 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-eqiad
* 12:43 blake@cumin1003: START - Cookbook sre.dns.netbox
* 12:41 blake@cumin1003: START - Cookbook sre.hosts.move-vlan for host mc-gp1006
* 12:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 12:41 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:41 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:41 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:41 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1006.eqiad.wmnet with OS trixie
* 12:41 blake@cumin1003: START - Cookbook sre.hosts.move-vlan for host mc-gp1005
* 12:40 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1005.eqiad.wmnet with OS trixie
* 12:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2004.codfw.wmnet with reason: host reimage
* 12:37 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1163: Migration of db1163.eqiad.wmnet completed
* 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2003.codfw.wmnet with reason: host reimage
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:32 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 12:32 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 12:32 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 12:32 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 12:29 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2004.codfw.wmnet with reason: host reimage
* 12:28 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2003.codfw.wmnet with reason: host reimage
* 12:24 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1045: repool after upgrade
* 12:23 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:22 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1045.eqiad.wmnet with OS trixie
* 12:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2001.codfw.wmnet with reason: host reimage
* 12:19 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:16 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2004.codfw.wmnet with OS bookworm
* 12:16 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2003.codfw.wmnet with OS bookworm
* 12:15 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2001.codfw.wmnet with reason: host reimage
* 12:13 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 12:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:07 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:07 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:07 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 12:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:05 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1045.eqiad.wmnet with reason: host reimage
* 12:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2044: repool after maintenance es2044
* 12:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 12:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 12:01 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1045.eqiad.wmnet with reason: host reimage
* 11:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 11:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 11:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 11:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 11:51 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 11:51 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1163: Migration of db1163.eqiad.wmnet completed
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1045.eqiad.wmnet with OS trixie
* 11:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1045: Upgrading es1045.eqiad.wmnet
* 11:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1045: Upgrading es1045.eqiad.wmnet
* 11:42 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1163.eqiad.wmnet with OS trixie
* 11:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 11:35 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 11:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1191.eqiad.wmnet with reason: upgrading
* 11:23 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 11:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1163.eqiad.wmnet with reason: host reimage
* 11:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1172.eqiad.wmnet with reason: upgrading
* 11:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet
* 11:21 marostegui@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1:00:00 on db1171.eqiad.wmnet with reason: upgrading
* 11:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: upgrading
* 11:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1163.eqiad.wmnet with reason: host reimage
* 11:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:17 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2044: repool after maintenance es2044
* 11:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2044.codfw.wmnet with OS trixie
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1003.eqiad.wmnet with OS bookworm
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:11 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:10 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 11:09 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:08 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1038: Migration of es1038.eqiad.wmnet completed
* 11:04 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1163.eqiad.wmnet with OS trixie
* 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:01 moritzm: The Debian mirror on mirrors.wikimedia.org has been disabled [[phab:T416707|T416707]]
* 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1163: Upgrading db1163.eqiad.wmnet
* 10:59 btullis@cumin1003: START - Cookbook sre.hosts.dhcp for host dse-k8s-wdqs2001.codfw.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1163: Upgrading db1163.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2044.codfw.wmnet with reason: host reimage
* 10:53 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2044.codfw.wmnet with reason: host reimage
* 10:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1003.eqiad.wmnet with reason: host reimage
* 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2203: Migration of db2203.codfw.wmnet completed
* 10:43 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1003.eqiad.wmnet with reason: host reimage
* 10:38 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2044.codfw.wmnet with OS trixie
* 10:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2044: Upgrading es2044.codfw.wmnet
* 10:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2044: Upgrading es2044.codfw.wmnet
* 10:35 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:35 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:35 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:34 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:34 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:34 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:31 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1003.eqiad.wmnet with OS bookworm
* 10:29 moritzm: installing git-lfs security updates
* 10:28 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 10:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1038: Migration of es1038.eqiad.wmnet completed
* 10:22 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 10:21 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 10:17 claime: cumin -x 'A:swift-fe' "enable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1038.eqiad.wmnet with OS trixie
* 10:12 claime: cumin -x 'A:swift-fe' "disable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
* 10:10 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 10:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:04 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1002.eqiad.wmnet with reason: host reimage
* 10:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2203: Migration of db2203.codfw.wmnet completed
* 10:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1002.eqiad.wmnet with reason: host reimage
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1038.eqiad.wmnet with reason: host reimage
* 09:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1038.eqiad.wmnet with reason: host reimage
* 09:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2203.codfw.wmnet with OS trixie
* 09:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2045: repool after maintenance es2045
* 09:48 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 09:47 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303356{{!}}hCaptcha: Remove config for VE and DT enable (T428883)]], [[gerrit:1303354{{!}}Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883)]] (duration: 15m 32s)
* 09:41 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 09:39 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 09:38 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1038.eqiad.wmnet with OS trixie
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1038: Upgrading es1038.eqiad.wmnet
* 09:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1303356{{!}}hCaptcha: Remove config for VE and DT enable (T428883)]], [[gerrit:1303354{{!}}Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1038: Upgrading es1038.eqiad.wmnet
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:37 marostegui@dns1004: END - running authdns-update
* 09:36 marostegui@cumin1003: dbctl commit (dc=all): 'Set es6 eqiad back to read-write - [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94226 and previous config saved to /var/cache/conftool/dbconfig/20260617-093559-marostegui.json
* 09:35 marostegui@dns1004: START - running authdns-update
* 09:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es1038 [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94225 and previous config saved to /var/cache/conftool/dbconfig/20260617-093513-marostegui.json
* 09:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2203.codfw.wmnet with reason: host reimage
* 09:33 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1037 to es6 primary [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94224 and previous config saved to /var/cache/conftool/dbconfig/20260617-093310-marostegui.json
* 09:32 marostegui: Starting es6 eqiad failover from es1038 to es1037 - [[phab:T429436|T429436]]
* 09:32 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1303356{{!}}hCaptcha: Remove config for VE and DT enable (T428883)]], [[gerrit:1303354{{!}}Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883)]]
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es1037 with weight 0 [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94223 and previous config saved to /var/cache/conftool/dbconfig/20260617-092940-marostegui.json
* 09:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 8 hosts with reason: Primary switchover es6 [[phab:T429436|T429436]]
* 09:29 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es6 eqiad as read-only for maintenance - [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94222 and previous config saved to /var/cache/conftool/dbconfig/20260617-092913-marostegui.json
* 09:27 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2203.codfw.wmnet with reason: host reimage
* 09:26 jynus: testing x1 backups @ cumin2003 [[phab:T427897|T427897]]
* 09:11 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2203.codfw.wmnet with OS trixie
* 09:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2203: Upgrading db2203.codfw.wmnet
* 09:09 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2203: Upgrading db2203.codfw.wmnet
* 09:09 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:07 elukey: add basic Kafka ACLs for anonymous to logging-codfw - [[phab:T425528|T425528]] (I'll add rollback steps in the task if needed)
* 09:06 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2045: repool after maintenance es2045
* 09:06 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es2044: Upgrading es2044.codfw.wmnet
* 09:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2044: Upgrading es2044.codfw.wmnet
* 09:04 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:02 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2046 to es5 codfw primary [[phab:T428572|T428572]]', diff saved to https://phabricator.wikimedia.org/P94219 and previous config saved to /var/cache/conftool/dbconfig/20260617-090221-marostegui.json
* 09:02 joal@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:01 joal@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:00 joal@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 08:59 joal@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:57 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2203 [[phab:T429190|T429190]]', diff saved to https://phabricator.wikimedia.org/P94218 and previous config saved to /var/cache/conftool/dbconfig/20260617-085615-cwilliams.json
* 08:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2009.codfw.wmnet with OS trixie
* 08:55 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:55 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:53 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2212 to s1 primary [[phab:T429190|T429190]]', diff saved to https://phabricator.wikimedia.org/P94217 and previous config saved to /var/cache/conftool/dbconfig/20260617-085310-cwilliams.json
* 08:51 cezmunsta: Starting s1 codfw failover from db2203 to db2212 - [[phab:T429190|T429190]]
* 08:51 marostegui@dns1004: END - running authdns-update
* 08:49 marostegui@dns1004: START - running authdns-update
* 08:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:46 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2212 with weight 0 [[phab:T429190|T429190]]', diff saved to https://phabricator.wikimedia.org/P94215 and previous config saved to /var/cache/conftool/dbconfig/20260617-084642-cwilliams.json
* 08:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:46 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T429190|T429190]]
* 08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1044: repool after upgrade
* 08:38 jelto: "Imported helm3 3.19.5-1 to bullseye-wikimedia, bookworm-wikimedia and trixie-wikimedia - [[phab:T427403|T427403]]"
* 08:38 moritzm: installing apache2 security updates
* 08:36 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:35 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2009.codfw.wmnet with reason: host reimage
* 08:31 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2009.codfw.wmnet with reason: host reimage
* 08:25 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303296{{!}}Squashed diff to master]], [[gerrit:1303295{{!}}Squashed diff to master]] (duration: 35m 34s)
* 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2008.codfw.wmnet with OS trixie
* 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:17 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:14 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2009.codfw.wmnet with OS trixie
* 08:12 mlitn@deploy1003: mlitn: Continuing with deployment
* 08:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 08:09 mlitn@deploy1003: mlitn: Backport for [[gerrit:1303296{{!}}Squashed diff to master]], [[gerrit:1303295{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:07 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 08:04 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2008.codfw.wmnet with reason: host reimage
* 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2007.codfw.wmnet with OS trixie
* 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:03 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1001.eqiad.wmnet with OS bookworm
* 08:01 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 08:00 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1044: repool after upgrade
* 08:00 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2008.codfw.wmnet with reason: host reimage
* 07:59 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1044.eqiad.wmnet with OS trixie
* 07:53 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:49 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1303296{{!}}Squashed diff to master]], [[gerrit:1303295{{!}}Squashed diff to master]]
* 07:44 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2007.codfw.wmnet with reason: host reimage
* 07:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 07:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 07:42 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2008.codfw.wmnet with OS trixie
* 07:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1044.eqiad.wmnet with reason: host reimage
* 07:39 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2007.codfw.wmnet with reason: host reimage
* 07:32 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1044.eqiad.wmnet with reason: host reimage
* 07:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:23 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:23 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2007.codfw.wmnet with OS trixie
* 07:22 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:22 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003"
* 07:22 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003
* 07:21 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:21 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003
* 07:21 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003"
* 07:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1044.eqiad.wmnet with OS trixie
* 07:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1044: Upgrading es1044.eqiad.wmnet
* 07:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1044: Upgrading es1044.eqiad.wmnet
* 07:15 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1037: Migration of es1037.eqiad.wmnet completed
* 06:53 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 06:53 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 06:52 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 06:52 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 06:46 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 06:46 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 06:46 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 06:46 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 06:28 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1037: Migration of es1037.eqiad.wmnet completed
* 06:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1037.eqiad.wmnet with OS trixie
* 05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1037.eqiad.wmnet with reason: host reimage
* 05:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1037.eqiad.wmnet with reason: host reimage
* 05:38 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1037.eqiad.wmnet with OS trixie
* 05:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1037: Upgrading es1037.eqiad.wmnet
* 05:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1037: Upgrading es1037.eqiad.wmnet
* 05:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 00:01 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
* 00:01 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
== 2026-06-16 ==
* 23:44 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2006.codfw.wmnet with reason: host reimage
* 23:38 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2006.codfw.wmnet with reason: host reimage
* 23:03 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 23:02 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp - OpenSSL update ()
* 23:01 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
* 22:57 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
* 22:57 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
* 22:52 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
* 22:50 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
* 22:50 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:49 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:37 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
* 22:30 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 22:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:08 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:07 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302953{{!}}Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355)]], [[gerrit:1302952{{!}}Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355)]] (duration: 08m 11s)
* 22:02 kemayo@deploy1003: kemayo: Continuing with deployment
* 22:01 kemayo@deploy1003: kemayo: Backport for [[gerrit:1302953{{!}}Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355)]], [[gerrit:1302952{{!}}Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:59 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1302953{{!}}Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355)]], [[gerrit:1302952{{!}}Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355)]]
* 21:52 ryankemper@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 21:50 ryankemper@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 21:49 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 21:48 ryankemper@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 21:48 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:46 ryankemper@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:45 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:38 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:34 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302934{{!}}Update definition of html heading to match Parsoid/core (T417530 T417531 T428677)]] (duration: 18m 41s)
* 21:32 robh@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:31 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:30 robh@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:29 cscott@deploy1003: arlolra, cscott: Continuing with deployment
* 21:26 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 21:25 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 21:24 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 21:24 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 21:23 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 21:21 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 21:20 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 21:20 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 21:17 cscott@deploy1003: arlolra, cscott: Backport for [[gerrit:1302934{{!}}Update definition of html heading to match Parsoid/core (T417530 T417531 T428677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1302934{{!}}Update definition of html heading to match Parsoid/core (T417530 T417531 T428677)]]
* 21:10 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:08 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 20:54 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2043.*
* 20:51 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302890{{!}}Guard round function with a supports query (T424596)]], [[gerrit:1302935{{!}}Add wprov parameter to home link (T429268)]] (duration: 09m 28s)
* 20:47 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:43 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1302890{{!}}Guard round function with a supports query (T424596)]], [[gerrit:1302935{{!}}Add wprov parameter to home link (T429268)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:41 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1302890{{!}}Guard round function with a supports query (T424596)]], [[gerrit:1302935{{!}}Add wprov parameter to home link (T429268)]]
* 20:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns5004.*
* 20:33 brett@dns1004: END - running authdns-update
* 20:31 brett@dns1004: START - running authdns-update
* 20:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns5004.wikimedia.org with OS bookworm
* 20:30 brett@dns5004: FAIL - running authdns-update
* 20:29 brett@dns5004: START - running authdns-update
* 20:28 brett@dns5004: FAIL - running authdns-update
* 20:27 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302320{{!}}EditChecks: Namespace tracking object for seen/shown/used checks]] (duration: 09m 50s)
* 20:26 brett@dns5004: START - running authdns-update
* 20:26 brett@dns5004: START - running authdns-update
* 20:25 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns5004.*,service=authdns-update
* 20:23 kemayo@deploy1003: kemayo: Continuing with deployment
* 20:19 kemayo@deploy1003: kemayo: Backport for [[gerrit:1302320{{!}}EditChecks: Namespace tracking object for seen/shown/used checks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:18 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 20:17 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1302320{{!}}EditChecks: Namespace tracking object for seen/shown/used checks]]
* 20:09 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 20:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1001.eqiad.wmnet with reason: host reimage
* 19:56 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1001.eqiad.wmnet with reason: host reimage
* 19:55 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 19:55 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:54 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:47 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:46 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 19:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1001.eqiad.wmnet with OS bookworm
* 19:39 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:35 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp - OpenSSL update ()
* 19:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:31 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 19:30 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp - OpenSSL update ()
* 19:27 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:18 topranks: restarting grpc server on eqiad SR-Linux switches to recover from problem of no free threads [[phab:T429242|T429242]]
* 19:08 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:08 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:00 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302274{{!}}Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188)]] (duration: 11m 18s)
* 18:58 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:56 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:56 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:55 krinkle@deploy1003: krinkle: Continuing with deployment
* 18:52 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:51 krinkle@deploy1003: krinkle: Backport for [[gerrit:1302274{{!}}Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:48 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1302274{{!}}Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188)]]
* 18:45 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns5004.wikimedia.org with reason: host reimage
* 18:41 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
* 18:41 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/data-gateway: apply
* 18:41 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns5004.wikimedia.org with reason: host reimage
* 18:40 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
* 18:39 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
* 18:39 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 18:39 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 18:35 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 18:34 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 18:33 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 18:30 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 18:23 jhuneidi@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.7 refs [[phab:T423916|T423916]]
* 18:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:12 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host dns5004
* 18:12 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dns5004
* 18:08 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dns5004
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dns5004.wikimedia.org 8.166.102.103.in-addr.arpa 8.0.0.0.6.6.1.0.2.0.1.0.3.0.1.0.1.0.0.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 18:08 brett@cumin2002: START - Cookbook sre.dns.wipe-cache dns5004.wikimedia.org 8.166.102.103.in-addr.arpa 8.0.0.0.6.6.1.0.2.0.1.0.3.0.1.0.1.0.0.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host dns5004 - brett@cumin2002"
* 18:08 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host dns5004 - brett@cumin2002"
* 18:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:00 brett@cumin2002: START - Cookbook sre.dns.netbox
* 18:00 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:59 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:53 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=dns5004.*
* 17:47 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:47 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change mgmt name for frproto1001 - cmooney@cumin1003"
* 17:46 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host dns5004
* 17:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host dns5004.wikimedia.org with OS bookworm
* 17:44 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change mgmt name for frproto1001 - cmooney@cumin1003"
* 17:43 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host conf2007.codfw.wmnet with OS trixie
* 17:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302912{{!}}Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it"]], [[gerrit:1302909{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]], [[gerrit:1302908{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]] (duration: 32m 19s)
* 17:38 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 17:30 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:29 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302912{{!}}Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it"]], [[gerrit:1302909{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]], [[gerrit:1302908{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified t
* 17:27 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2007.codfw.wmnet with OS trixie
* 17:25 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1007.eqiad.wmnet with OS trixie
* 17:20 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
* 17:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302912{{!}}Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it"]], [[gerrit:1302909{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]], [[gerrit:1302908{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]]
* 16:35 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:09 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab1004 - [[phab:T429350|T429350]] (duration: 00m 45s)
* 16:08 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab1004 - [[phab:T429350|T429350]]
* 16:08 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy
* 16:08 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab2002 - [[phab:T429350|T429350]] (duration: 00m 47s)
* 16:07 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab2002 - [[phab:T429350|T429350]]
* 16:06 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy
* 16:04 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 15:42 urbanecm@deploy1003: mwscript-k8s job started: GrowthExperiments:migrateMentorStatusAway --wiki=abwiki --dry-run # [[phab:T409170|T409170]]
* 15:39 moritzm: installing Tomcat security updates
* 15:38 urbanecm: Remove `migrateMentorStatusAwayToCommunityConfiguration` from `updatelog` on all wikis in `growthexperiments.dblist` ([[phab:T409170|T409170]])
* 15:38 dancy@deploy1003: Installation of scap version "4.269.0" completed for 2 hosts
* 15:36 dancy@deploy1003: Installing scap version "4.269.0" for 2 host(s)
* 15:33 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: test deploy phab2003 - [[phab:T427286|T427286]] (duration: 00m 49s)
* 15:33 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: test deploy phab2003 - [[phab:T427286|T427286]]
* 15:16 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 15:16 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 15:07 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-welcome # [[phab:T429352|T429352]]
* 15:06 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-discovery # [[phab:T429352|T429352]]
* 15:03 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-mentorship # [[phab:T429352|T429352]]
* 15:01 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302804{{!}}Hotfix for T428620 (T428620)]] (duration: 10m 00s)
* 14:57 awight@deploy1003: seanleong-wmde, awight: Continuing with deployment
* 14:55 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-help-panel # [[phab:T429352|T429352]]
* 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for frproto1001 (formerly payments1008) - cmooney@cumin1003"
* 14:54 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for frproto1001 (formerly payments1008) - cmooney@cumin1003"
* 14:53 awight@deploy1003: seanleong-wmde, awight: Backport for [[gerrit:1302804{{!}}Hotfix for T428620 (T428620)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:51 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1302804{{!}}Hotfix for T428620 (T428620)]]
* 14:48 aokoth@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab (duration: 02m 09s)
* 14:46 aokoth@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab
* 14:28 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 14:28 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 14:07 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302792{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187)]], [[gerrit:1302793{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T429187)]] (duration: 11m 29s)
* 14:07 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:03 dcausse@deploy1003: jgiannelos, dcausse: Continuing with deployment
* 14:02 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 14:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:58 dcausse@deploy1003: jgiannelos, dcausse: Backport for [[gerrit:1302792{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187)]], [[gerrit:1302793{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T429187)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:56 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1302792{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187)]], [[gerrit:1302793{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T429187)]]
* 13:54 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:52 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:52 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:52 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:51 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:48 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302850{{!}}Revert "translate: remove CirrusSearch endpoints"]] (duration: 04m 10s)
* 13:47 atsuko@deploy1003: atsuko: Rolling back deployment
* 13:47 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:46 atsuko@deploy1003: atsuko: Backport for [[gerrit:1302850{{!}}Revert "translate: remove CirrusSearch endpoints"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:44 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1302850{{!}}Revert "translate: remove CirrusSearch endpoints"]]
* 13:44 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:43 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:43 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:43 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:41 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:40 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 13:40 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 13:39 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:39 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302197{{!}}translate: remove CirrusSearch endpoints (T425377)]] (duration: 11m 16s)
* 13:37 atsuko@deploy1003: atsuko: Rolling back deployment
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1080.eqiad.wmnet with OS trixie
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:36 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:34 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 13:32 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1079.eqiad.wmnet with OS trixie
* 13:32 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:30 atsuko@deploy1003: atsuko: Backport for [[gerrit:1302197{{!}}translate: remove CirrusSearch endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1302197{{!}}translate: remove CirrusSearch endpoints (T425377)]]
* 13:25 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299626{{!}}Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206)]] (duration: 08m 50s)
* 13:25 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:21 dcausse@deploy1003: dcausse, neriah: Continuing with deployment
* 13:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:20 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1080.eqiad.wmnet with reason: host reimage
* 13:18 dcausse@deploy1003: dcausse, neriah: Backport for [[gerrit:1299626{{!}}Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:16 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1299626{{!}}Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206)]]
* 13:15 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:12 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1080.eqiad.wmnet with reason: host reimage
* 13:12 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298875{{!}}Remove custom streams (T423148)]] (duration: 08m 35s)
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1079.eqiad.wmnet with reason: host reimage
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1008.eqiad.wmnet with OS trixie
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:07 jmm@dns1004: END - running authdns-update
* 13:06 mfossati@deploy1003: ksarabia, mfossati: Continuing with deployment
* 13:05 mfossati@deploy1003: ksarabia, mfossati: Backport for [[gerrit:1298875{{!}}Remove custom streams (T423148)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 jmm@dns1004: START - running authdns-update
* 13:03 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1298875{{!}}Remove custom streams (T423148)]]
* 13:02 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1079.eqiad.wmnet with reason: host reimage
* 13:02 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:02 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1080.eqiad.wmnet with OS trixie
* 12:57 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:52 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1079.eqiad.wmnet with OS trixie
* 12:52 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 12:51 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1007.eqiad.wmnet with OS trixie
* 12:50 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 12:50 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver2002.codfw.wmnet
* 12:48 cmooney@cumin1003: START - Cookbook sre.mysql.pool pool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2255.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2255.codfw.wmnet
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2254.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2254.codfw.wmnet
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2243.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2243.codfw.wmnet
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2242.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2242.codfw.wmnet
* 12:47 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2092.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2092.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2091.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2091.codfw.wmnet
* 12:46 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 29 hosts
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2078.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2078.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2077.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2077.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2076.codfw.wmnet
* 12:46 cmooney@cumin1003: START - Cookbook sre.hosts.remove-downtime for 29 hosts
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2076.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2075.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2075.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2074.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2074.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2051.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2051.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2044.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2044.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2041.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2041.codfw.wmnet
* 12:46 cmooney@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2001.codfw.wmnet
* 12:46 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:45 cmooney@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2001.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2018.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2018.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2017.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2017.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2014.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2014.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2013.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2013.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2012.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2012.codfw.wmnet
* 12:44 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1008.eqiad.wmnet with reason: host reimage
* 12:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver2002.codfw.wmnet
* 12:40 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1008.eqiad.wmnet with reason: host reimage
* 12:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1006.eqiad.wmnet with reason: host reimage
* 12:28 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:24 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1008.eqiad.wmnet with OS trixie
* 12:24 topranks: reboot lsw1-a5-codfw to complete JunOS upgrade [[phab:T428020|T428020]]
* 12:23 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
* 12:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1006.eqiad.wmnet with reason: host reimage
* 12:19 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2255.codfw.wmnet
* 12:19 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2255.codfw.wmnet
* 12:19 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2254.codfw.wmnet
* 12:18 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2254.codfw.wmnet
* 12:17 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2243.codfw.wmnet
* 12:17 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2243.codfw.wmnet
* 12:17 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2242.codfw.wmnet
* 12:16 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2242.codfw.wmnet
* 12:16 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2092.codfw.wmnet
* 12:16 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2092.codfw.wmnet
* 12:16 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2091.codfw.wmnet
* 12:15 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2091.codfw.wmnet
* 12:15 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2078.codfw.wmnet
* 12:14 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2078.codfw.wmnet
* 12:14 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2077.codfw.wmnet
* 12:14 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2077.codfw.wmnet
* 12:14 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2076.codfw.wmnet
* 12:13 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2076.codfw.wmnet
* 12:13 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2075.codfw.wmnet
* 12:12 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2075.codfw.wmnet
* 12:12 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2074.codfw.wmnet
* 12:12 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2074.codfw.wmnet
* 12:12 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2051.codfw.wmnet
* 12:10 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 29 hosts with reason: lsw1-a5-codfw JunOS upgrade
* 12:07 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2051.codfw.wmnet
* 12:06 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lsw1-a5-codfw,lsw1-a5-codfw IPv6,lsw1-a5-codfw.mgmt,ssw1-a[1,8]-codfw.mgmt with reason: switch upgrade
* 12:06 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2044.codfw.wmnet
* 12:06 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2044.codfw.wmnet
* 12:06 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2041.codfw.wmnet
* 12:05 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2041.codfw.wmnet
* 12:05 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2018.codfw.wmnet
* 12:05 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2018.codfw.wmnet
* 12:04 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2017.codfw.wmnet
* 12:04 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2017.codfw.wmnet
* 12:04 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2014.codfw.wmnet
* 12:03 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2014.codfw.wmnet
* 12:03 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2013.codfw.wmnet
* 12:03 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2013.codfw.wmnet
* 12:02 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2012.codfw.wmnet
* 12:02 cmooney@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2001.codfw.wmnet
* 12:01 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2012.codfw.wmnet
* 12:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 11:57 cmooney@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2001.codfw.wmnet
* 11:51 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302794{{!}}Revert "hCaptcha: Enable for UploadWizard on all wikis with it"]] (duration: 08m 45s)
* 11:49 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:49 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:49 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 11:48 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:47 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:47 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 11:46 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:46 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:46 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 11:46 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:45 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302794{{!}}Revert "hCaptcha: Enable for UploadWizard on all wikis with it"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:43 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 11:43 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302794{{!}}Revert "hCaptcha: Enable for UploadWizard on all wikis with it"]]
* 11:42 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 11:41 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2035: Migration of es2035.codfw.wmnet completed
* 11:06 moritzm: installing Bird security updates on routed Ganeti nodes
* 10:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es1037 [[phab:T429118|T429118]]', diff saved to https://phabricator.wikimedia.org/P94172 and previous config saved to /var/cache/conftool/dbconfig/20260616-104931-marostegui.json
* 10:25 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 10:24 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 10:24 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2035: Migration of es2035.codfw.wmnet completed
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for an-redacteddb1001.eqiad.wmnet
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for an-redacteddb1001.eqiad.wmnet
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 11 hosts
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for 11 hosts
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1155.eqiad.wmnet
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1155.eqiad.wmnet
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1154.eqiad.wmnet
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1154.eqiad.wmnet
* 10:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Migration of es1036.eqiad.wmnet completed
* 10:22 jmm@dns1004: END - running authdns-update
* 10:22 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:20 jmm@dns1004: START - running authdns-update
* 10:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:19 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 10:18 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 10:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:18 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 10:18 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 10:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2035.codfw.wmnet with OS trixie
* 09:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2035.codfw.wmnet with reason: host reimage
* 09:52 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2035.codfw.wmnet with reason: host reimage
* 09:49 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 09:48 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 09:47 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302762{{!}}hCaptcha: Enable for UploadWizard on all wikis with it (T426126)]] (duration: 09m 38s)
* 09:43 marostegui: Drop wrongly created tables on testwikidatawiki s3 master [[phab:T429304|T429304]]
* 09:42 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 09:39 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302762{{!}}hCaptcha: Enable for UploadWizard on all wikis with it (T426126)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:38 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --wiki=wikidatawiki --registeredWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # [[phab:T418115|T418115]]
* 09:37 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302762{{!}}hCaptcha: Enable for UploadWizard on all wikis with it (T426126)]]
* 09:37 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --wiki=wikidatawiki --registeredWithin=1year --editedWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # [[phab:T418115|T418115]]
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Migration of es1036.eqiad.wmnet completed
* 09:37 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --registeredWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # [[phab:T418115|T418115]]
* 09:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2035.codfw.wmnet with OS trixie
* 09:34 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2035: Upgrading es2035.codfw.wmnet
* 09:34 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2035: Upgrading es2035.codfw.wmnet
* 09:34 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:32 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2035 [[phab:T429303|T429303]]', diff saved to https://phabricator.wikimedia.org/P94164 and previous config saved to /var/cache/conftool/dbconfig/20260616-093247-marostegui.json
* 09:31 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2037 to es6 primary [[phab:T429303|T429303]]', diff saved to https://phabricator.wikimedia.org/P94163 and previous config saved to /var/cache/conftool/dbconfig/20260616-093149-marostegui.json
* 09:31 jayme: imported istioctl 1.29.4-1 to bookworm-/trixie-wikimedia - [[phab:T427401|T427401]]
* 09:30 marostegui: Starting es6 codfw failover from es2035 to es2037 - [[phab:T429303|T429303]]
* 09:30 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 09:30 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 09:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es2037 with weight 0 [[phab:T429303|T429303]]', diff saved to https://phabricator.wikimedia.org/P94162 and previous config saved to /var/cache/conftool/dbconfig/20260616-092937-marostegui.json
* 09:29 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 8 hosts with reason: Primary switchover es6 [[phab:T429303|T429303]]
* 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1036.eqiad.wmnet with OS trixie
* 09:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:19 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297161{{!}}[Growth] wikidatawiki: Enable Growth features (T418115)]] (duration: 16m 29s)
* 09:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:14 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 09:13 urbanecm: php multiversion/MWScript.php WikimediaMaintenance:createExtensionTables.php --wiki=<nowiki>{</nowiki>testwikidatawiki,wikidatawiki<nowiki>}</nowiki> growthexperiments # [[phab:T418115|T418115]], within mw-debug
* 09:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage
* 09:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 09:07 tappof@cumin1003: END (PASS) - Cookbook sre.metamonitoring.downtime (exit_code=0) Downtime for 0:05:00 of prometheus/deadmanswitchnotified, prometheus/deadmanswitchonamdb, prometheus/extmon on 2 host(s) with reason: cookbook test
* 09:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 09:07 tappof@cumin1003: START - Cookbook sre.metamonitoring.downtime Downtime for 0:05:00 of prometheus/deadmanswitchnotified, prometheus/deadmanswitchonamdb, prometheus/extmon on 2 host(s) with reason: cookbook test
* 09:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 09:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 09:04 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1297161{{!}}[Growth] wikidatawiki: Enable Growth features (T418115)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage
* 09:02 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1297161{{!}}[Growth] wikidatawiki: Enable Growth features (T418115)]]
* 09:01 moritzm: uploaded bird 2.18.2-1~wmf13u1 to trixie-wikimedia [[phab:T429285|T429285]]
* 09:00 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist wikidata WikimediaMaintenance:createExtensionTables.php GrowthExperiments # [[phab:T418115|T418115]]
* 08:56 moritzm: uploaded bird 2.18.2-1~wmf12u1 to bookworm-wikimedia [[phab:T429285|T429285]]
* 08:48 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1036.eqiad.wmnet with OS trixie
* 08:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1036: Upgrading es1036.eqiad.wmnet
* 08:46 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302735{{!}}hCaptcha: Enable for MobileFrontend in all wikis (T425940)]] (duration: 19m 23s)
* 08:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1036: Upgrading es1036.eqiad.wmnet
* 08:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1047: repool after upgrade
* 08:42 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:32 moritzm: installing nginx security updates
* 08:29 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302735{{!}}hCaptcha: Enable for MobileFrontend in all wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:27 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302735{{!}}hCaptcha: Enable for MobileFrontend in all wikis (T425940)]]
* 08:23 mszwarc@deploy1003: Synchronized private/PrivateSettings.php: Private code deployment for Suggested Investigations (duration: 02m 23s)
* 08:19 mszwarc@deploy1003: Synchronized private/SuggestedInvestigationsSignals: Private code deployment for Suggested Investigations (duration: 06m 03s)
* 08:17 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94157)
* 08:05 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302629{{!}}Improve click intent event logging and exposure tracking]] (duration: 11m 31s)
* 08:00 moritzm: update bird on ganeti7001 to 2.18.2-1~wmf12u1
* 07:58 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
* 07:58 wmde-fisch@deploy1003: wmde-fisch: Backport for [[gerrit:1302629{{!}}Improve click intent event logging and exposure tracking]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1047: repool after upgrade
* 07:54 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1302629{{!}}Improve click intent event logging and exposure tracking]]
* 07:50 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302170{{!}}Update VE core submodule to master (3e79e9934) (T397319 T428764)]] (duration: 36m 13s)
* 07:36 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
* 07:33 wmde-fisch@deploy1003: wmde-fisch: Backport for [[gerrit:1302170{{!}}Update VE core submodule to master (3e79e9934) (T397319 T428764)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:14 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1302170{{!}}Update VE core submodule to master (3e79e9934) (T397319 T428764)]]
* 07:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1047.eqiad.wmnet with OS trixie
* 06:50 hashar@deploy1003: Finished deploy [integration/docroot@2165507]: build: Updating js-yaml to 4.2.0 (duration: 00m 16s)
* 06:50 hashar@deploy1003: Started deploy [integration/docroot@2165507]: build: Updating js-yaml to 4.2.0
* 06:44 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1047.eqiad.wmnet with reason: host reimage
* 06:40 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1047.eqiad.wmnet with reason: host reimage
* 06:25 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1047.eqiad.wmnet with OS trixie
* 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:24 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1047: Upgrading es1047.eqiad.wmnet
* 05:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1047: Upgrading es1047.eqiad.wmnet
* 05:58 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 04:55 ryankemper: [[phab:T427951|T427951]] Deleted 4 leftover mirrored dev/test topics from kafka-test: `eqiad.mediawiki.<nowiki>{</nowiki>page_html_content_change.dev<nowiki>{</nowiki>1,4<nowiki>}</nowiki>,page_edit_type_simple.dev0<nowiki>}</nowiki>`, `eqiad.mw_page_edit_type_enrich.error`
* 04:05 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.4 (duration: 05m 29s)
== 2026-06-15 ==
* 22:35 sbassett: Deployed private config for [[phab:T429244|T429244]]
* 22:05 sbassett: Deployed updated security fix for [[phab:T427611|T427611]]
* 22:04 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 22:04 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 22:04 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 22:03 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 21:54 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302277{{!}}beta: Point remaining db11 references at deployment-db15 (T428930)]] (duration: 12m 27s)
* 21:53 dancy@deploy1003: dancy: Continuing with deployment
* 21:49 dancy@deploy1003: dancy: Backport for [[gerrit:1302277{{!}}beta: Point remaining db11 references at deployment-db15 (T428930)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:48 sbassett: Deployed security fix for [[phab:T428809|T428809]]
* 21:48 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1302277{{!}}beta: Point remaining db11 references at deployment-db15 (T428930)]]
* 21:40 sbassett: Deployed security fix for [[phab:T428820|T428820]]
* 21:22 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302267{{!}}ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks]] (duration: 08m 11s)
* 21:17 sbassett@deploy1003: sbassett: Continuing with deployment
* 21:15 sbassett@deploy1003: sbassett: Backport for [[gerrit:1302267{{!}}ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:13 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1302267{{!}}ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks]]
* 21:06 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5028.*
* 21:06 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 21:05 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 20:52 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5028.eqsin.wmnet with OS trixie
* 20:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 20:21 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300245{{!}}REST: set new RestModuleOverrides variable (T422756)]], [[gerrit:1302232{{!}}Enable "exit the editor" survey on 11 wikis for phase 2 (T426132)]] (duration: 10m 54s)
* 20:17 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 20:16 dancy@deploy1003: caro, dancy, bpirkle: Continuing with deployment
* 20:14 dancy@deploy1003: caro, dancy, bpirkle: Backport for [[gerrit:1300245{{!}}REST: set new RestModuleOverrides variable (T422756)]], [[gerrit:1302232{{!}}Enable "exit the editor" survey on 11 wikis for phase 2 (T426132)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1300245{{!}}REST: set new RestModuleOverrides variable (T422756)]], [[gerrit:1302232{{!}}Enable "exit the editor" survey on 11 wikis for phase 2 (T426132)]]
* 20:02 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS trixie
* 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS trixie
* 19:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5028
* 19:44 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5028
* 19:43 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5028
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5028.eqsin.wmnet 25.0.132.10.in-addr.arpa 5.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:43 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5028.eqsin.wmnet 25.0.132.10.in-addr.arpa 5.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5028 - brett@cumin2002"
* 19:42 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5028 - brett@cumin2002"
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:36 brett@cumin2002: START - Cookbook sre.dns.netbox
* 19:35 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3067.esams.wmnet
* 19:34 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3067.esams.wmnet
* 19:33 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 19:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3066.esams.wmnet
* 19:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:33 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3066.esams.wmnet
* 19:26 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5028
* 19:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 19:23 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart A:liberica-eqsin
* 19:21 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart A:liberica-eqsin
* 19:18 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 19:17 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:16 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:15 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:14 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:06 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
* 19:05 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 19:05 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:04 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5026.eqsin.wmnet with OS trixie
* 18:44 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-purged (exit_code=0) rolling restart_daemons on P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:42 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-purged rolling restart_daemons on P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 18:27 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:27 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:27 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 18:18 mutante: releases2003 - systemctl stop tmp.mount
* 17:53 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5026
* 17:53 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5026
* 17:52 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5026
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5026.eqsin.wmnet 37.0.132.10.in-addr.arpa 7.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 17:52 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5026.eqsin.wmnet 37.0.132.10.in-addr.arpa 7.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5026 - brett@cumin2002"
* 17:52 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5026 - brett@cumin2002"
* 17:46 brett@cumin2002: START - Cookbook sre.dns.netbox
* 17:40 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-d8-eqiad
* 17:40 cmooney@cumin1003: START - Cookbook sre.network.tls for network device ssw1-d8-eqiad
* 17:36 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-c4-eqiad
* 17:35 cmooney@cumin1003: START - Cookbook sre.network.tls for network device lsw1-c4-eqiad
* 17:34 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-c4-eqiad
* 17:34 cmooney@cumin1003: START - Cookbook sre.network.tls for network device lsw1-c4-eqiad
* 17:09 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5026
* 17:07 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5026.eqsin.wmnet with OS trixie
* 17:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:36 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 16:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 16:16 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
* 16:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:16 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/toolhub: apply
* {{safesubst:SAL entry|1=16:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302192{{!}}SourceEditorOverlayHookPayload: Allow aborting of the save (T428287)]], [[gerrit:1302194{{!}}hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287)]], [[gerrit:1302195{{!}}OATHUserRepository: Specify caller in query]], [[gerrit:1302186{{!}}Bump guzzlehttp/psr to version 2.11.0 (T429208)]], [[gerrit:1302169{{!}}NoReferrerLinks: Add re}}
* 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:08 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
* 16:08 dreamyjazz@deploy1003: reedy, dreamyjazz, kharlan: Continuing with deployment
* 16:08 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/toolhub: apply
* {{safesubst:SAL entry|1=16:07 dreamyjazz@deploy1003: reedy, dreamyjazz, kharlan: Backport for [[gerrit:1302192{{!}}SourceEditorOverlayHookPayload: Allow aborting of the save (T428287)]], [[gerrit:1302194{{!}}hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287)]], [[gerrit:1302195{{!}}OATHUserRepository: Specify caller in query]], [[gerrit:1302186{{!}}Bump guzzlehttp/psr to version 2.11.0 (T429208)]], [[gerrit:1302169{{!}}NoReferrerLinks: Add}}
* {{safesubst:SAL entry|1=16:05 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302192{{!}}SourceEditorOverlayHookPayload: Allow aborting of the save (T428287)]], [[gerrit:1302194{{!}}hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287)]], [[gerrit:1302195{{!}}OATHUserRepository: Specify caller in query]], [[gerrit:1302186{{!}}Bump guzzlehttp/psr to version 2.11.0 (T429208)]], [[gerrit:1302169{{!}}NoReferrerLinks: Add rel}}
* 16:04 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:04 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:51 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:51 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: puppet debugging
* 15:50 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: puppet debugging
* 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1196: Migration of db1196.eqiad.wmnet completed
* 15:41 mutante: added new project language 'nyn' - Bantu language spoken by the Nkore and Hema peoples of Southwestern Uganda
* 15:40 dzahn@dns1006: END - running authdns-update
* 15:36 dzahn@dns1006: START - running authdns-update
* 15:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1155.eqiad.wmnet
* 15:19 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1155.eqiad.wmnet
* 15:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1154.eqiad.wmnet
* 15:18 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1154.eqiad.wmnet
* 15:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 11 hosts
* 15:18 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for 11 hosts
* 15:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for an-redacteddb1001.eqiad.wmnet
* 15:17 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for an-redacteddb1001.eqiad.wmnet
* 15:16 topranks: repool esams following cr2-esams rpd crash
* 15:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool esams [reason: no reason specified, no task ID specified]
* 15:13 cmooney@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool esams [reason: no reason specified, no task ID specified]
* 15:02 topranks: depool esams due to cr2-esams rpd crash
* 15:02 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool esams [reason: no reason specified, no task ID specified]
* 15:01 cmooney@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool esams [reason: no reason specified, no task ID specified]
* 15:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 14:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 14:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1196: Migration of db1196.eqiad.wmnet completed
* 14:54 topranks: enable BGP graceful-shutdown sender on cr2-esams to drain traffic [[phab:T427056|T427056]]
* 14:52 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr2-esams,cr2-esams IPv6 with reason: bouncing pic0 to reconfigure port speeds
* 14:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1196.eqiad.wmnet with OS trixie
* 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1077.eqiad.wmnet with OS trixie
* 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 14:24 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2001.codfw.wmnet with reason: tesT
* 14:24 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 14:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
* 14:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
* 14:08 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 14:07 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 14:07 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudvirt1077.eqiad.wmnet with reason: host reimage
* 14:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1077.eqiad.wmnet with reason: host reimage
* 14:06 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 14:05 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 14:05 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 14:04 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 14:03 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1196.eqiad.wmnet with OS trixie
* 14:02 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 14:02 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 14:01 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 14:01 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 14:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1196: Upgrading db1196.eqiad.wmnet
* 14:00 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1196: Upgrading db1196.eqiad.wmnet
* 14:00 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:56 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1077.eqiad.wmnet with OS trixie
* 13:56 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 13:54 federico3: doing a quick restart of sanitarium hosts db1155 and db1154
* 13:53 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94145)
* 13:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1154.eqiad.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1155.eqiad.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 11 hosts with reason: Reboots [[phab:T426633|T426633]]
* 13:49 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet with reason: Reboots [[phab:T426633|T426633]]
* {{safesubst:SAL entry|1=13:43 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300835{{!}}Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742)]], [[gerrit:1302153{{!}}TaskSuggester: avoid nullable logger in setLogger call]], [[gerrit:1302100{{!}}migrateMentorStatusAway: ensure validateStrictly receives objects (T409170)]], [[gerrit:1301451{{!}}Store nowiki source in StripState::extra to support subst-nowiki (T}}
* 13:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 13:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 13:39 jforrester@deploy1003: arlolra, sgimeno, jforrester: Continuing with deployment
* {{safesubst:SAL entry|1=13:37 jforrester@deploy1003: arlolra, sgimeno, jforrester: Backport for [[gerrit:1300835{{!}}Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742)]], [[gerrit:1302153{{!}}TaskSuggester: avoid nullable logger in setLogger call]], [[gerrit:1302100{{!}}migrateMentorStatusAway: ensure validateStrictly receives objects (T409170)]], [[gerrit:1301451{{!}}Store nowiki source in StripState::extra to support subst-nowik}}
* {{safesubst:SAL entry|1=13:35 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1300835{{!}}Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742)]], [[gerrit:1302153{{!}}TaskSuggester: avoid nullable logger in setLogger call]], [[gerrit:1302100{{!}}migrateMentorStatusAway: ensure validateStrictly receives objects (T409170)]], [[gerrit:1301451{{!}}Store nowiki source in StripState::extra to support subst-nowiki (T3}}
* 13:34 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2216: Migration of db2216.codfw.wmnet completed
* 13:29 topranks: enable BGP graceful-shutdown sender on cr2-esams to drain traffic [[phab:T427056|T427056]]
* 13:28 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr2-esams,cr2-esams IPv6 with reason: bouncing pic0 to reconfigure port speeds
* 13:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:26 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 13:25 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 13:25 topranks: cr2-esams, reconfigure chassis fpc to set port 0 to 100G [[phab:T427056|T427056]]
* 13:25 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 13:24 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1251: Migration of db1251.eqiad.wmnet completed
* {{safesubst:SAL entry|1=13:22 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293173{{!}}Configure wgOAuthAutoApprove['protocols'] (T412542 T426614)]], [[gerrit:1300873{{!}}jawiki: remove four rights from the eliminator group (T428942)]], [[gerrit:1301401{{!}}Deploy PRV to 6 wikis (T429038)]], [[gerrit:1300858{{!}}[abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730)]], [[gerrit:1300872{{!}}abstractwiki: Temporary config f}}
* 13:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:18 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:17 jforrester@deploy1003: arlolra, matmarex, jforrester, dragoniez: Continuing with deployment
* {{safesubst:SAL entry|1=13:13 jforrester@deploy1003: arlolra, matmarex, jforrester, dragoniez: Backport for [[gerrit:1293173{{!}}Configure wgOAuthAutoApprove['protocols'] (T412542 T426614)]], [[gerrit:1300873{{!}}jawiki: remove four rights from the eliminator group (T428942)]], [[gerrit:1301401{{!}}Deploy PRV to 6 wikis (T429038)]], [[gerrit:1300858{{!}}[abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730)]], [[gerrit:1300872{{!}}abstractwiki: Te}}
* 13:13 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:12 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* {{safesubst:SAL entry|1=13:12 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1293173{{!}}Configure wgOAuthAutoApprove['protocols'] (T412542 T426614)]], [[gerrit:1300873{{!}}jawiki: remove four rights from the eliminator group (T428942)]], [[gerrit:1301401{{!}}Deploy PRV to 6 wikis (T429038)]], [[gerrit:1300858{{!}}[abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730)]], [[gerrit:1300872{{!}}abstractwiki: Temporary config fo}}
* 13:10 moritzm: installing Linux 6.1.174 on Bookworm hosts
* 13:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:05 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:48 moritzm: installing augeas security updates
* 12:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2216: Migration of db2216.codfw.wmnet completed
* 12:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:43 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2036: Migration of es2036.codfw.wmnet completed
* 12:38 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302124{{!}}Extract a service that initiates SI signal matching (T428557)]], [[gerrit:1302125{{!}}Trigger Suggested Investigations when client hints are saved (T428557)]] (duration: 07m 42s)
* 12:37 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1251: Migration of db1251.eqiad.wmnet completed
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2216.codfw.wmnet with OS trixie
* 12:34 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:34 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 12:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:32 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1302124{{!}}Extract a service that initiates SI signal matching (T428557)]], [[gerrit:1302125{{!}}Trigger Suggested Investigations when client hints are saved (T428557)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:31 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1302124{{!}}Extract a service that initiates SI signal matching (T428557)]], [[gerrit:1302125{{!}}Trigger Suggested Investigations when client hints are saved (T428557)]]
* 12:27 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1251.eqiad.wmnet with OS trixie
* 12:23 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 12:21 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 12:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2216.codfw.wmnet with reason: host reimage
* 12:15 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2216.codfw.wmnet with reason: host reimage
* 12:10 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1251.eqiad.wmnet with reason: host reimage
* 12:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 12:06 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:05 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1251.eqiad.wmnet with reason: host reimage
* 11:56 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 11:55 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 11:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2036: Migration of es2036.codfw.wmnet completed
* 11:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:53 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2216.codfw.wmnet with OS trixie
* 11:50 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2216: Upgrading db2216.codfw.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2216: Upgrading db2216.codfw.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:48 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1251.eqiad.wmnet with OS trixie
* 11:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1251: Upgrading db1251.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1251: Upgrading db1251.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:44 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94128)
* 11:43 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:43 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on eqiad-k8s (dblist: https://phabricator.wikimedia.org/P94127)
* 11:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2036.codfw.wmnet with OS trixie
* 11:37 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2036.codfw.wmnet with reason: host reimage
* 11:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2036.codfw.wmnet with reason: host reimage
* 11:09 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-eqiad
* 11:08 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-eqiad
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2036.codfw.wmnet with OS trixie
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2036: Upgrading es2036.codfw.wmnet
* 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2036: Upgrading es2036.codfw.wmnet
* 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:55 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-codfw
* 10:54 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-codfw
* 10:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2037: repool after upgrade
* 10:52 moritzm: installing openssl security updates on bookworm
* 10:30 cgoubert@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301341{{!}}Close API Portal wiki (T427537)]] (duration: 07m 16s)
* 10:26 cgoubert@deploy1003: cgoubert: Continuing with deployment
* 10:25 cgoubert@deploy1003: cgoubert: Backport for [[gerrit:1301341{{!}}Close API Portal wiki (T427537)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:23 cgoubert@deploy1003: Started scap sync-world: Backport for [[gerrit:1301341{{!}}Close API Portal wiki (T427537)]]
* 10:16 blake@deploy1003: Finished scap sync-world: apache config change ([[phab:T428772|T428772]]) (duration: 06m 41s)
* 10:12 blake@deploy1003: blake: Continuing with deployment
* 10:11 blake@deploy1003: blake: apache config change ([[phab:T428772|T428772]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:10 blake@deploy1003: Started scap sync-world: apache config change ([[phab:T428772|T428772]])
* 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2037: repool after upgrade
* 10:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2037.codfw.wmnet with OS trixie
* 09:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 09:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 09:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 09:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:43 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 09:42 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 09:40 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on eqiad-k8s (dblist: https://phabricator.wikimedia.org/P94120)
* 09:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2037.codfw.wmnet with reason: host reimage
* 09:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2037.codfw.wmnet with reason: host reimage
* 09:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:15 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:14 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2037.codfw.wmnet with OS trixie
* 09:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:12 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:59 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:56 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2037.codfw.wmnet with OS trixie
* 08:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2037.codfw.wmnet with OS trixie
* 08:53 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2037: Upgrading es2037.codfw.wmnet
* 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2037: Upgrading es2037.codfw.wmnet
* 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:45 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:45 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:44 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:43 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:41 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:40 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:35 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main'.
* 08:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P94117 and previous config saved to /var/cache/conftool/dbconfig/20260615-081440-fceratto.json
* 08:10 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301373{{!}}translate: production opensearch on k8s endpoints (T425377)]] (duration: 20m 54s)
* 08:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2047: Migration of es2047.codfw.wmnet completed
* 08:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P94115 and previous config saved to /var/cache/conftool/dbconfig/20260615-080432-fceratto.json
* 08:03 atsuko@deploy1003: atsuko: Continuing with deployment
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P94114 and previous config saved to /var/cache/conftool/dbconfig/20260615-075425-fceratto.json
* 07:53 atsuko@deploy1003: atsuko: Backport for [[gerrit:1301373{{!}}translate: production opensearch on k8s endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:49 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1301373{{!}}translate: production opensearch on k8s endpoints (T425377)]]
* 07:47 dcausse@deploy1003: mwscript-k8s job started: namespaceDupes cswiki --fix # [[phab:T428619|T428619]]
* 07:46 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301675{{!}}Switch wmgUseCalendar to false for dewikivoyage (T429095)]], [[gerrit:1300301{{!}}Add alias namespace for cswiki (T428619)]] (duration: 34m 37s)
* 07:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P94112 and previous config saved to /var/cache/conftool/dbconfig/20260615-074417-fceratto.json
* 07:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:33 dcausse@deploy1003: vadymts1, dcausse: Continuing with deployment
* 07:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:31 cwilliams@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on db-test2001.codfw.wmnet with reason: Testing
* 07:28 dcausse@deploy1003: vadymts1, dcausse: Backport for [[gerrit:1301675{{!}}Switch wmgUseCalendar to false for dewikivoyage (T429095)]], [[gerrit:1300301{{!}}Add alias namespace for cswiki (T428619)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:26 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:26 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:25 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:24 arnaudb@dns1005: END - running authdns-update
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1163 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P94110 and previous config saved to /var/cache/conftool/dbconfig/20260615-072446-fceratto.json
* 07:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 07:24 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:23 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:23 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2047: Migration of es2047.codfw.wmnet completed
* 07:23 arnaudb@dns1005: START - running authdns-update
* 07:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:21 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:11 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2047.codfw.wmnet with OS trixie
* 07:11 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1301675{{!}}Switch wmgUseCalendar to false for dewikivoyage (T429095)]], [[gerrit:1300301{{!}}Add alias namespace for cswiki (T428619)]]
* 07:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 06:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2047.codfw.wmnet with reason: host reimage
* 06:53 moritzm: imported zookeeper 3.4.13-6+wmf12u1 to component/zookeeper34 for bookworm-wikimedia [[phab:T428495|T428495]]
* 06:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2047.codfw.wmnet with reason: host reimage
* 06:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2047.codfw.wmnet with OS trixie
* 06:28 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2047: Upgrading es2047.codfw.wmnet
* 06:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2047: Upgrading es2047.codfw.wmnet
* 06:27 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 06:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:59 marostegui: install mariadb 10.11.18 on pc1 [[phab:T428861|T428861]]
* 05:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2021.codfw.wmnet,pc1021.eqiad.wmnet with reason: upgrading
* 05:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:56 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:56 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:49 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:49 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2046', diff saved to https://phabricator.wikimedia.org/P94105 and previous config saved to /var/cache/conftool/dbconfig/20260615-053403-marostegui.json
* 05:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on es2046.codfw.wmnet with reason: cloning
* 05:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on es2045.codfw.wmnet with reason: crash
* 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2046', diff saved to https://phabricator.wikimedia.org/P94104 and previous config saved to /var/cache/conftool/dbconfig/20260615-053041-marostegui.json
* 02:18 Amir1: making Dexbot a bot in cywiki ([[phab:T428927|T428927]])
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 58s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-14 ==
* 11:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 11:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 11:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 11:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 34s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-13 ==
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-12 ==
* 19:54 dwisehaupt@dns1004: END - running authdns-update
* 19:52 dwisehaupt@dns1004: START - running authdns-update
* 18:33 dwisehaupt@dns1006: END - running authdns-update
* 18:32 dwisehaupt@dns1006: START - running authdns-update
* 16:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 15:59 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 15:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 15:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:43 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301371{{!}}Hotfix for T428620 (T428620)]] (duration: 11m 17s)
* 14:36 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 14:35 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1301371{{!}}Hotfix for T428620 (T428620)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:31 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1301371{{!}}Hotfix for T428620 (T428620)]]
* 14:29 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:28 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:22 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:22 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:22 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:22 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 12:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 12:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 12:04 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:04 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:04 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 12:02 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 12:01 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 11:40 moritzm: installing Linux 5.10.257 on Bullseye hosts
* 11:36 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:35 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:35 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:24 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 11:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:56 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:56 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:49 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:49 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:40 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:37 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:36 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:35 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:12 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:12 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:08 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 09:59 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 09:58 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 09:57 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.disable-merges (exit_code=0)
* 06:11 jmm@cumin2002: START - Cookbook sre.puppet.disable-merges
* 03:07 ryankemper: [[phab:T427951|T427951]] sorry, `[eqiad,codfw].mediawiki.page_html_content_change.rc0` (accidentally a word)
* 03:06 ryankemper: [[phab:T427951|T427951]] Deleted all 20 unused dev/test topics on kafka-jumbo (verified empty first); 2 (`[eqiad,codfw]page_html_content_change.rc0`) were immediately auto-recreated empty by a still-running `dse-k8s` enrichment consumer; awaiting owner confirmation before final re-delete
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 01m 13s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:00 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0
== 2026-06-11 ==
* 22:27 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 22:26 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 22:14 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 22:13 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 22:05 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300906{{!}}Restore MediaViewer toggle in Special:Preferences (T428742)]] (duration: 30m 51s)
* 21:58 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host releases2003.codfw.wmnet with OS trixie
* 21:52 egardner@deploy1003: egardner: Continuing with deployment
* 21:51 egardner@deploy1003: egardner: Backport for [[gerrit:1300906{{!}}Restore MediaViewer toggle in Special:Preferences (T428742)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:34 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1300906{{!}}Restore MediaViewer toggle in Special:Preferences (T428742)]]
* 21:34 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: host reimage
* 21:29 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300913{{!}}Avoid the escaping from nowiki processing (T398967)]] (duration: 09m 09s)
* 21:28 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on releases2003.codfw.wmnet with reason: host reimage
* 21:25 arlolra@deploy1003: arlolra: Continuing with deployment
* 21:22 arlolra@deploy1003: arlolra: Backport for [[gerrit:1300913{{!}}Avoid the escaping from nowiki processing (T398967)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:20 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1300913{{!}}Avoid the escaping from nowiki processing (T398967)]]
* 21:07 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300911{{!}}hCaptcha: Enable for badlogin for all small wikis (T426875)]], [[gerrit:1300905{{!}}RadioRangeBallot: Fix strict mode issue (T428947)]] (duration: 10m 43s)
* 21:06 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on A:cp-text and not P<nowiki>{</nowiki>cp7008*<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0
* 21:01 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 21:00 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300911{{!}}hCaptcha: Enable for badlogin for all small wikis (T426875)]], [[gerrit:1300905{{!}}RadioRangeBallot: Fix strict mode issue (T428947)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:56 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300911{{!}}hCaptcha: Enable for badlogin for all small wikis (T426875)]], [[gerrit:1300905{{!}}RadioRangeBallot: Fix strict mode issue (T428947)]]
* 20:51 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300842{{!}}Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313)]], [[gerrit:1300843{{!}}[A11y] Donor Badge: Remove Badge button disappears too quickly (T428646)]], [[gerrit:1300896{{!}}Donor Delight Badge, styles: Amending to final design review feedback (T427313)]] (duration: 34m 10s)
* 20:39 jdrewniak@deploy1003: annet, jdrewniak: Continuing with deployment
* 20:35 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host releases2003.codfw.wmnet with OS trixie
* 20:34 jdrewniak@deploy1003: annet, jdrewniak: Backport for [[gerrit:1300842{{!}}Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313)]], [[gerrit:1300843{{!}}[A11y] Donor Badge: Remove Badge button disappears too quickly (T428646)]], [[gerrit:1300896{{!}}Donor Delight Badge, styles: Amending to final design review feedback (T427313)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1300842{{!}}Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313)]], [[gerrit:1300843{{!}}[A11y] Donor Badge: Remove Badge button disappears too quickly (T428646)]], [[gerrit:1300896{{!}}Donor Delight Badge, styles: Amending to final design review feedback (T427313)]]
* 19:12 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 18:12 ozge@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 18:12 ozge@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 17:52 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300865{{!}}UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146)]] (duration: 08m 15s)
* 17:48 reedy@deploy1003: reedy: Continuing with deployment
* 17:46 reedy@deploy1003: reedy: Backport for [[gerrit:1300865{{!}}UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:44 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1300865{{!}}UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146)]]
* 17:26 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:25 blake@deploy1003: Scap cancelled without rolling back.
* 17:25 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 17:24 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 17:24 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:24 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 17:24 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:23 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 17:23 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 17:23 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:23 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 17:23 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:23 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 17:20 blake@deploy1003: blake: apache config update ([[phab:T428772|T428772]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:20 blake@deploy1003: Started scap sync-world: apache config update ([[phab:T428772|T428772]])
* 17:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 17:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Migration of db2212.codfw.wmnet completed
* 17:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 17:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1235: Migration of db1235.eqiad.wmnet completed
* 17:08 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:45 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:43 dzahn@dns1005: END - running authdns-update
* 16:42 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:41 dzahn@dns1005: START - running authdns-update
* 16:41 mutante: releases.wikimedia.org - switching backend from codfw to eqiad - releases1003 is now the source of rsync for uploaded releases files (use releases.discovery.wmnet to not have to think about it) - [[phab:T418299|T418299]]
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts rdb2007.codfw.wmnet
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts rdb1011.eqiad.wmnet
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2009.codfw.wmnet
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:33 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Migration of db2212.codfw.wmnet completed
* 16:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1235: Migration of db1235.eqiad.wmnet completed
* 16:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2212.codfw.wmnet with OS trixie
* 16:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1235.eqiad.wmnet with OS trixie
* 16:13 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:07 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 16:05 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 16:05 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 16:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 16:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2212.codfw.wmnet with reason: host reimage
* 16:01 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
* 16:01 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:01 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
* 16:01 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 16:00 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
* 16:00 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 16:00 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
* 16:00 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2212.codfw.wmnet with reason: host reimage
* 15:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1235.eqiad.wmnet with reason: host reimage
* 15:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:58 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 15:57 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
* 15:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:57 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 15:57 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/wikifeeds: apply
* 15:56 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2009.codfw.wmnet
* 15:55 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:55 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb1011.eqiad.wmnet
* 15:55 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:55 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2007.codfw.wmnet
* 15:54 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 15:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1235.eqiad.wmnet with reason: host reimage
* 15:54 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 15:53 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 15:53 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 15:40 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 15:40 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2212.codfw.wmnet with OS trixie
* 15:39 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 15:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1235.eqiad.wmnet with OS trixie
* 15:36 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 15:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1235: Upgrading db1235.eqiad.wmnet
* 15:35 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 15:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1235: Upgrading db1235.eqiad.wmnet
* 15:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:32 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:31 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 15:30 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300822{{!}}T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530)]] (duration: 11m 29s)
* 15:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2212: Upgrading db2212.codfw.wmnet
* 15:26 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2212: Upgrading db2212.codfw.wmnet
* 15:26 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:26 cscott@deploy1003: cscott: Continuing with deployment
* 15:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1235: Upgrading db1235.eqiad.wmnet
* 15:25 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1235: Upgrading db1235.eqiad.wmnet
* 15:25 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:21 cscott@deploy1003: cscott: Backport for [[gerrit:1300822{{!}}T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:19 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1300822{{!}}T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530)]]
* 15:18 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 15:17 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 15:13 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 15:13 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 15:13 moritzm: installing libdbi-perl security updates
* 14:53 moritzm: installing Bind security updates (just client-side tools/libraries)
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry
* 14:48 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry
* 14:43 moritzm: installing Poppler security updates
* 14:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:33 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 14:32 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 14:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1234: Migration of db1234.eqiad.wmnet completed
* 14:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
* 14:24 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
* 14:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
* 14:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
* 14:00 Lucas_WMDE: UTC afternoon backport+config window done
* 13:58 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300733{{!}}stream: webrequest.page_view_stats.dev0 (T428725)]] (duration: 08m 12s)
* 13:57 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5024.*
* 13:55 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp5024.*
* 13:55 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5020.*
* 13:54 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 13:52 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1300733{{!}}stream: webrequest.page_view_stats.dev0 (T428725)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:51 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5004*<nowiki>}</nowiki> and A:liberica
* 13:50 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1300733{{!}}stream: webrequest.page_view_stats.dev0 (T428725)]]
* 13:50 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5004*<nowiki>}</nowiki> and A:liberica
* 13:50 slyngs: reloading liberica config on lvs5004
* 13:50 moritzm: installing openssl security updates
* 13:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5006.eqsin.wmnet with OS bookworm
* 13:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1234: Migration of db1234.eqiad.wmnet completed
* 13:46 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:45 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2202.codfw.wmnet with OS trixie
* 13:43 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298890{{!}}Add 2FA enforcement demotion config for phase 3 groups (T423120)]] (duration: 07m 19s)
* 13:39 alexsanford@deploy1003: alexsanford: Continuing with deployment
* 13:38 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1298890{{!}}Add 2FA enforcement demotion config for phase 3 groups (T423120)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:36 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1298890{{!}}Add 2FA enforcement demotion config for phase 3 groups (T423120)]]
* 13:36 slyngshede@dns1004: END - running authdns-update
* 13:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1234.eqiad.wmnet with OS trixie
* 13:34 moritzm: installing dovecot security updates
* 13:34 slyngshede@dns1004: START - running authdns-update
* 13:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:32 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300787{{!}}hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940)]] (duration: 06m 59s)
* 13:29 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 13:29 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 13:29 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:29 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:28 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:28 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:28 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 13:27 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300787{{!}}hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2202.codfw.wmnet with reason: host reimage
* 13:25 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300787{{!}}hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940)]]
* 13:25 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:24 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300736{{!}}fix: correct intake-url and payload type for NCS experiment events (T422295)]] (duration: 06m 51s)
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
* 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
* 13:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
* 13:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2202.codfw.wmnet with reason: host reimage
* 13:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1300736{{!}}fix: correct intake-url and payload type for NCS experiment events (T422295)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 13:17 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 13:16 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1300736{{!}}fix: correct intake-url and payload type for NCS experiment events (T422295)]]
* 13:15 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
* 13:14 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:13 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300731{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] (duration: 08m 47s)
* 13:13 andrewbogott: sudo -i reprepro --noskipold --component thirdparty/openstack-trixie-flamingo-backports update trixie-wikimedia
* 13:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
* 13:12 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 13:12 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/iOS_FAQ 'Wikimedia Apps/FAQ/iOS' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:12 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 13:12 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 13:11 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 13:11 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:11 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:11 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 13:11 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 13:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 13:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 13:09 gkyziridis@deploy1003: gkyziridis: Continuing with deployment
* 13:06 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1300731{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:06 claime: echo 'https://api.wikimedia.org/service/lw/specs/openapi.yaml' {{!}} mwscript-k8s --attach -- purgeList.php
* 13:04 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1300731{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]]
* 13:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2202.codfw.wmnet with OS trixie
* 13:00 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:57 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1234.eqiad.wmnet with OS trixie
* 12:55 moritzm: installing Exim security updates on Bullseye
* 12:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ganeti5006
* 12:47 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti5006
* 12:46 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti5006
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti5006.eqsin.wmnet 9.0.132.10.in-addr.arpa 9.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 12:46 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti5006.eqsin.wmnet 9.0.132.10.in-addr.arpa 9.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5006 - jmm@cumin2002"
* 12:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5006 - jmm@cumin2002"
* 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1234: Upgrading db1234.eqiad.wmnet
* 12:44 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1234: Upgrading db1234.eqiad.wmnet
* 12:44 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2188: Migration of db2188.codfw.wmnet completed
* 12:29 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "UX improvements - oblivian@cumin1003"
* 12:29 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: UX improvements - oblivian@cumin1003
* 12:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1232: Migration of db1232.eqiad.wmnet completed
* 12:28 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: UX improvements - oblivian@cumin1003
* 12:28 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "UX improvements - oblivian@cumin1003"
* 12:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:26 jmm@cumin2002: START - Cookbook sre.hosts.move-vlan for host ganeti5006
* 12:26 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5006.eqsin.wmnet with OS bookworm
* 12:21 moritzm: remove ganeti5006 from eqsin cluster for reimage [[phab:T428229|T428229]]
* 12:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 12:10 moritzm: installing openjdk-21 security updates on Bookworm
* 12:03 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300764{{!}}Remove GrowthExperiments extension from closed wikis (T428884)]] (duration: 06m 53s)
* 11:59 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 11:58 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1300764{{!}}Remove GrowthExperiments extension from closed wikis (T428884)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:56 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1300764{{!}}Remove GrowthExperiments extension from closed wikis (T428884)]]
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb1012.eqiad.wmnet
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2010.codfw.wmnet
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 11:46 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:46 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2008.codfw.wmnet
* 11:46 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2188: Migration of db2188.codfw.wmnet completed
* 11:44 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 11:43 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:43 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1232: Migration of db1232.eqiad.wmnet completed
* 11:38 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:37 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 11:37 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 11:36 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 11:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2188.codfw.wmnet with OS trixie
* 11:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb1012.eqiad.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2008.codfw.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2010.codfw.wmnet
* 11:33 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 11:32 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 11:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1232.eqiad.wmnet with OS trixie
* 11:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2002.codfw.wmnet
* 11:25 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300749{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300751{{!}}hCaptcha: Enable for DiscussionTools on all wikis (T426039)]] (duration: 08m 38s)
* 11:21 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 11:19 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300749{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300751{{!}}hCaptcha: Enable for DiscussionTools on all wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2188.codfw.wmnet with reason: host reimage
* 11:17 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300749{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300751{{!}}hCaptcha: Enable for DiscussionTools on all wikis (T426039)]]
* 11:15 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2188.codfw.wmnet with reason: host reimage
* 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1232.eqiad.wmnet with reason: host reimage
* 11:13 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2002.codfw.wmnet
* 11:13 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 11:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 11:11 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 11:09 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2001.codfw.wmnet
* 11:09 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1232.eqiad.wmnet with reason: host reimage
* 11:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:04 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2001.codfw.wmnet
* 11:04 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
* 11:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1262.eqiad.wmnet with reason: crash
* 11:00 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 11:00 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
* 10:59 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:59 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 10:58 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2188.codfw.wmnet with OS trixie
* 10:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2188: Upgrading db2188.codfw.wmnet
* 10:52 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2188: Upgrading db2188.codfw.wmnet
* 10:52 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1232.eqiad.wmnet with OS trixie
* 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1232: Upgrading db1232.eqiad.wmnet
* 10:48 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1232: Upgrading db1232.eqiad.wmnet
* 10:48 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:33 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:32 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:31 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300734{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300727{{!}}hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039)]] (duration: 11m 01s)
* 10:26 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 10:23 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:23 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:22 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300734{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300727{{!}}hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:20 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300734{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300727{{!}}hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039)]]
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:10 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:10 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 10:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2045.codfw.wmnet with OS trixie
* 10:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2046', diff saved to https://phabricator.wikimedia.org/P94069 and previous config saved to /var/cache/conftool/dbconfig/20260611-100221-marostegui.json
* 10:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2046', diff saved to https://phabricator.wikimedia.org/P94068 and previous config saved to /var/cache/conftool/dbconfig/20260611-100145-marostegui.json
* 10:01 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:59 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300580{{!}}ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976)]] (duration: 15m 41s)
* 09:54 jiji@deploy1003: jiji: Continuing with deployment
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2045.codfw.wmnet with reason: host reimage
* 09:45 jiji@deploy1003: jiji: Backport for [[gerrit:1300580{{!}}ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:43 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1300580{{!}}ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976)]]
* 09:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2045.codfw.wmnet with reason: host reimage
* 09:37 elukey: uploaded spicerack_12.8.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 09:26 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
* 09:26 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es2045.codfw.wmnet with OS bookworm
* 09:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2176: Migration of db2176.codfw.wmnet completed
* 09:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1219: Migration of db1219.eqiad.wmnet completed
* 09:11 claime: cumin -x 'A:swift-fe' "disable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
* 08:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS bookworm
* 08:39 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2176: Migration of db2176.codfw.wmnet completed
* 08:34 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94055)
* 08:34 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1219: Migration of db1219.eqiad.wmnet completed
* 08:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94053)
* 08:30 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T428823|T428823]] (duration: 01m 18s)
* 08:29 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T428823|T428823]]
* 08:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2176.codfw.wmnet with OS trixie
* 08:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1021: Migration to 10.11.17
* 08:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 08:25 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 08:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc1021: Migration to 10.11.17
* 08:25 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94052)
* 08:24 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): Testing upgrade for [[phab:T428823|T428823]] (duration: 01m 17s)
* 08:23 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): Testing upgrade for [[phab:T428823|T428823]]
* 08:22 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94051)
* 08:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1219.eqiad.wmnet with OS trixie
* 08:17 moritzm: installing PHP 8.2 security updates
* 08:15 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 08:14 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 08:11 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 08:11 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 08:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
* 08:08 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1013.eqiad.wmnet with OS trixie
* 08:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:06 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 08:06 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 08:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2021.codfw.wmnet,pc1021.eqiad.wmnet with reason: upgrade
* 08:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1219.eqiad.wmnet with reason: host reimage
* 08:05 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 08:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 08:04 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
* 08:04 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 08:03 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 08:03 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 07:58 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1219.eqiad.wmnet with reason: host reimage
* 07:56 marostegui: install mariadb 10.11.17 on pc1 [[phab:T427345|T427345]]
* 07:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1013.eqiad.wmnet with reason: host reimage
* 07:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1013.eqiad.wmnet with reason: host reimage
* 07:49 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 07:49 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 07:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 07:47 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 07:47 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 07:46 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2176.codfw.wmnet with OS trixie
* 07:43 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1219.eqiad.wmnet with OS trixie
* 07:43 moritzm: imported Jenkins 2.541.3 for thirdparty/ci (Bullseye) and thirdparty/jenkins (Bookworm, Trixie)
* 07:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 07:35 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1013.eqiad.wmnet with OS trixie
* 07:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2176: Upgrading db2176.codfw.wmnet
* 07:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1219: Upgrading db1219.eqiad.wmnet
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2176: Upgrading db2176.codfw.wmnet
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1219: Upgrading db1219.eqiad.wmnet
* 07:31 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:30 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 07:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1163: Repooling
* 07:19 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 06:51 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
* 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2042', diff saved to https://phabricator.wikimedia.org/P94044 and previous config saved to /var/cache/conftool/dbconfig/20260611-065049-marostegui.json
* 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2042', diff saved to https://phabricator.wikimedia.org/P94043 and previous config saved to /var/cache/conftool/dbconfig/20260611-065027-marostegui.json
* 06:44 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1163: Repooling
* 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1163 [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94041 and previous config saved to /var/cache/conftool/dbconfig/20260611-064319-fceratto.json
* 06:42 fceratto@dns1005: END - running authdns-update
* 06:40 fceratto@dns1005: START - running authdns-update
* 06:33 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:33 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-write for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:33 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1184 to s1 primary and set section read-write [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94040 and previous config saved to /var/cache/conftool/dbconfig/20260611-063323-fceratto.json
* 06:32 fceratto@cumin1003: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94039 and previous config saved to /var/cache/conftool/dbconfig/20260611-063251-fceratto.json
* 06:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:32 fceratto@cumin1003: Dbctl change: Setting sections s1 as read-write for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:32 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-write for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:31 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94037 and previous config saved to /var/cache/conftool/dbconfig/20260611-063100-fceratto.json
* 06:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:30 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-only for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:30 fceratto@cumin1003: Dbctl change: Setting sections s1 as read-only for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:29 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:29 federico3: Starting s1 eqiad failover from db1163 to db1184 - [[phab:T426083|T426083]]
* 06:22 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1184 with weight 0 [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94035 and previous config saved to /var/cache/conftool/dbconfig/20260611-062224-fceratto.json
* 06:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T426083|T426083]]
* 05:37 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 05:28 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 05:27 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 05:18 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 05:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2045: Upgrading es2045.codfw.wmnet
* 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2045: Upgrading es2045.codfw.wmnet
* 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 44s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:23 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2046.*
* 01:19 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 01:18 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 01:18 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1009.eqiad.wmnet with OS trixie
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:11 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:11 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:11 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:10 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:10 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:09 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 01:09 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 01:08 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 01:08 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 01:08 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 01:07 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:07 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 01:06 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 01:06 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 01:06 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 01:05 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 01:05 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 01:05 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 01:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
* 00:58 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
* 00:54 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main1009
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main1009
* 00:41 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main1009
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main1009.eqiad.wmnet 37.48.64.10.in-addr.arpa 7.3.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:41 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main1009.eqiad.wmnet 37.48.64.10.in-addr.arpa 7.3.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1009 - jasmine@cumin2002"
* 00:40 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1009 - jasmine@cumin2002"
* 00:39 cdanis@cumin1003: dbctl commit (dc=all): 'depool db1262', diff saved to https://phabricator.wikimedia.org/P94032 and previous config saved to /var/cache/conftool/dbconfig/20260611-003950-cdanis.json
* 00:36 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 00:34 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5020.*
* 00:30 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main1009
* 00:30 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1009.eqiad.wmnet with OS trixie
* 00:03 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5024.*
== 2026-06-10 ==
* 23:53 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5024.*
* 23:15 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300154{{!}}Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188)]] (duration: 11m 37s)
* 23:11 krinkle@deploy1003: krinkle: Continuing with deployment
* 23:06 krinkle@deploy1003: krinkle: Backport for [[gerrit:1300154{{!}}Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:04 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1300154{{!}}Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188)]]
* 22:57 ladsgroup@dns1004: END - running authdns-update
* 22:55 ladsgroup@dns1004: START - running authdns-update
* 22:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5024.eqsin.wmnet with OS trixie
* 22:13 mutante: gerrit - restarting service for logging change
* 22:11 dzahn@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:10:00 on gerrit.wikimedia.org with reason: service restart
* 22:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on gerrit2003.wikimedia.org with reason: service restart
* 22:06 mutante: gerrit-spare: restarting gerrit
* 22:06 mutante: gerrit-replica: restarting gerrit
* 21:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 21:37 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 21:22 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300250{{!}}ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801)]], [[gerrit:1300248{{!}}tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade]] (duration: 08m 23s)
* 21:17 jforrester@deploy1003: jforrester: Continuing with deployment
* 21:15 jforrester@deploy1003: jforrester: Backport for [[gerrit:1300250{{!}}ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801)]], [[gerrit:1300248{{!}}tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:13 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1300250{{!}}ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801)]], [[gerrit:1300248{{!}}tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade]]
* 21:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5024
* 21:02 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5024
* 21:02 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300247{{!}}Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902)]] (duration: 06m 51s)
* 21:00 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5024
* 21:00 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5024.eqsin.wmnet 35.0.132.10.in-addr.arpa 5.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 21:00 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5024.eqsin.wmnet 35.0.132.10.in-addr.arpa 5.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 21:00 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:00 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5024 - brett@cumin2002"
* 20:59 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5024 - brett@cumin2002"
* 20:57 catrope@deploy1003: catrope: Continuing with deployment
* 20:57 catrope@deploy1003: catrope: Backport for [[gerrit:1300247{{!}}Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:55 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1300247{{!}}Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902)]]
* 20:54 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:50 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5024
* 20:49 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
* 20:48 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5020.*
* 20:44 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300073{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] (duration: 11m 55s)
* 20:40 catrope@deploy1003: catrope, gkyziridis: Continuing with deployment
* 20:34 catrope@deploy1003: catrope, gkyziridis: Backport for [[gerrit:1300073{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:32 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1300073{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]]
* 20:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5020.eqsin.wmnet with OS trixie
* 20:30 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300226{{!}}[arzwiki] Change the wordmark (T427720)]] (duration: 09m 49s)
* 20:25 catrope@deploy1003: gergesshamon, catrope: Continuing with deployment
* 20:22 catrope@deploy1003: gergesshamon, catrope: Backport for [[gerrit:1300226{{!}}[arzwiki] Change the wordmark (T427720)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:20 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1300226{{!}}[arzwiki] Change the wordmark (T427720)]]
* 19:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 19:53 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 19:30 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 19:27 bblack@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=1) rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 19:23 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2046.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:19 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2046.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:19 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5020
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5020
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2044.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:18 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5020
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5020.eqsin.wmnet 24.0.132.10.in-addr.arpa 4.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:18 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5020.eqsin.wmnet 24.0.132.10.in-addr.arpa 4.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:17 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5020 - brett@cumin2002"
* 19:17 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5020 - brett@cumin2002"
* 19:14 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2044.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:11 brett@cumin2002: START - Cookbook sre.dns.netbox
* 19:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 19:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2174: Migration of db2174.codfw.wmnet completed
* 19:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 19:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1218: Migration of db1218.eqiad.wmnet completed
* 18:24 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5020
* 18:23 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS trixie
* 18:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2174: Migration of db2174.codfw.wmnet completed
* 18:20 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 18:17 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1218: Migration of db1218.eqiad.wmnet completed
* 18:16 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5018.*
* 18:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2174.codfw.wmnet with OS trixie
* 18:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1218.eqiad.wmnet with OS trixie
* 17:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
* 17:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1218.eqiad.wmnet with reason: host reimage
* 17:46 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2010.codfw.wmnet with OS trixie
* 17:45 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 17:44 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 17:44 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
* 17:42 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1218.eqiad.wmnet with reason: host reimage
* 17:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94021)
* 17:29 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
* 17:26 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1218.eqiad.wmnet with OS trixie
* 17:26 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2174.codfw.wmnet with OS trixie
* 17:25 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:24 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:24 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1218: Upgrading db1218.eqiad.wmnet
* 17:24 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:24 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1218: Upgrading db1218.eqiad.wmnet
* 17:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 17:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2174: Upgrading db2174.codfw.wmnet
* 17:23 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:23 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
* 17:23 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:22 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2174: Upgrading db2174.codfw.wmnet
* 17:22 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:22 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 17:22 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:22 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 17:22 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-text and not P<nowiki>{</nowiki>cp7008*<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 17:21 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 17:21 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:19 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 17:19 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:18 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:18 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:17 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:17 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:17 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:13 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:12 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-ntp (exit_code=0) rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 17:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 17:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1206: Migration of db1206.eqiad.wmnet completed
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2010
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2010
* 17:02 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2010
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2010.codfw.wmnet 35.48.192.10.in-addr.arpa 5.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:02 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2010.codfw.wmnet 35.48.192.10.in-addr.arpa 5.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2010 - jasmine@cumin2002"
* 17:01 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2010 - jasmine@cumin2002"
* 16:57 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 16:50 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2010
* 16:50 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS trixie
* 16:41 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 16:39 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 16:39 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 16:34 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 16:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5018.eqsin.wmnet with OS trixie
* 16:22 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 16:20 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 16:17 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1206: Migration of db1206.eqiad.wmnet completed
* 16:15 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 16:15 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 16:14 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 16:12 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 16:12 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:11 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 16:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1206.eqiad.wmnet with OS trixie
* 16:01 blblack: apt: uploaded libvmod-wmfuniq 0.3.0 for trixie
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 15:53 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:52 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:51 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 15:50 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1206.eqiad.wmnet with reason: host reimage
* 15:45 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1206.eqiad.wmnet with reason: host reimage
* 15:43 sukhe@cumin1003: END (FAIL) - Cookbook sre.dns.admin (exit_code=99) DNS admin: depool drmrs [reason: no reason specified, no task ID specified]
* 15:42 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool drmrs [reason: no reason specified, no task ID specified]
* 15:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2173: Migration of db2173.codfw.wmnet completed
* 15:34 topranks: drain traffic through cr2-drmrs to reset pic0
* 15:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94013)
* 15:30 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1206.eqiad.wmnet with OS trixie
* 15:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1206: Upgrading db1206.eqiad.wmnet
* 15:28 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1206: Upgrading db1206.eqiad.wmnet
* 15:27 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:25 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:24 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1009
* 15:24 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Harroyo-wmf out of all services on: 2436 hosts
* 15:23 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1009
* 15:21 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:20 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release
* 15:19 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5018
* 15:19 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5018
* 15:18 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:18 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5018
* 15:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5018.eqsin.wmnet 18.0.132.10.in-addr.arpa 8.1.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 15:18 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5018.eqsin.wmnet 18.0.132.10.in-addr.arpa 8.1.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 15:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:15 brett@cumin2002: START - Cookbook sre.dns.netbox
* 15:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1195: Migration of db1195.eqiad.wmnet completed
* 15:12 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:11 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:11 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin1003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:11 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin1003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:08 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300169{{!}}Fix snak value display for rtl languages (T360854)]], [[gerrit:1300168{{!}}Fix snak value display for rtl languages (T360854)]] (duration: 08m 39s)
* 15:03 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 15:01 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1300169{{!}}Fix snak value display for rtl languages (T360854)]], [[gerrit:1300168{{!}}Fix snak value display for rtl languages (T360854)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:59 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1300169{{!}}Fix snak value display for rtl languages (T360854)]], [[gerrit:1300168{{!}}Fix snak value display for rtl languages (T360854)]]
* 14:58 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:55 Lucas_WMDE: lucaswerkmeister-wmde@deploy1003 $ printf 'https://www.mediawiki.org/keys/%s\n' '' 'keys.txt' 'keys.html' {{!}} mwscript-k8s --attach --comment=[[phab:T423267|T423267]] purgeList mediawikiwiki
* 14:54 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release, now with correct schema
* 14:53 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2173: Migration of db2173.codfw.wmnet completed
* 14:50 ayounsi@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin2003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:50 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:48 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:47 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299614{{!}}Add my public key to mediawiki.org/keys (T423267)]] (duration: 08m 33s)
* 14:46 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:42 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, matmarex: Continuing with deployment
* 14:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2173.codfw.wmnet with OS trixie
* 14:40 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, matmarex: Backport for [[gerrit:1299614{{!}}Add my public key to mediawiki.org/keys (T423267)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:40 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:40 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:38 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1299614{{!}}Add my public key to mediawiki.org/keys (T423267)]]
* 14:38 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 14:34 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:34 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:33 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 14:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1195: Migration of db1195.eqiad.wmnet completed
* 14:28 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 14:26 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 14:26 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 14:24 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release, now with dblist translate
* 14:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2173.codfw.wmnet with reason: host reimage
* 14:23 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 14:22 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 14:22 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 14:21 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 14:20 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart (exit_code=0) rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 14:20 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2173.codfw.wmnet with reason: host reimage
* 14:20 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1195.eqiad.wmnet with OS trixie
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 14:16 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 14:15 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 14:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 14:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 14:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 14:13 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:13 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 14:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 14:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 14:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 14:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 14:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 14:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 14:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2173.codfw.wmnet with OS trixie
* 14:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 14:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1195.eqiad.wmnet with reason: host reimage
* 14:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 13:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2173: Upgrading db2173.codfw.wmnet
* 13:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2173: Upgrading db2173.codfw.wmnet
* 13:58 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:58 atsuko@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/ttmserver-export.php --wiki=default --ttmserver eqiad-test # [[phab:T425377|T425377]] populating production index on test cluster to estimate time required for the release
* 13:56 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1195.eqiad.wmnet with reason: host reimage
* 13:54 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Atieno out of all services on: 2436 hosts
* 13:42 Lucas_WMDE: UTC afternoon backport+config window done
* 13:42 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1195.eqiad.wmnet with OS trixie
* 13:36 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297237{{!}}wmf-config: Update private subnets to include additions (T427393)]] (duration: 07m 20s)
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1195: Upgrading db1195.eqiad.wmnet
* 13:33 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling restart_daemons on A:hcaptcha-proxy and A:hcaptcha-proxy
* 13:33 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling restart_daemons on A:durum and A:durum
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2170: Migration of db2170.codfw.wmnet completed
* 13:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1195: Upgrading db1195.eqiad.wmnet
* 13:32 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:32 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, brett: Continuing with deployment
* 13:32 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough
* 13:31 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 13:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, brett: Backport for [[gerrit:1297237{{!}}wmf-config: Update private subnets to include additions (T427393)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:31 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 13:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1297237{{!}}wmf-config: Update private subnets to include additions (T427393)]]
* 13:28 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5018.eqsin.wmnet with reason: host down
* 13:28 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling restart_daemons on A:tcpproxy and A:tcpproxy
* 13:25 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5018.eqsin.wmnet,service=(cdn{{!}}ats-be)
* 13:22 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 13:20 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling restart_daemons on A:durum and A:durum
* 13:20 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling restart_daemons on A:hcaptcha-proxy and A:hcaptcha-proxy
* 13:19 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299676{{!}}Enable ULS v2 on group0 wikis]] (duration: 17m 00s)
* 13:19 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough
* 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1186: Migration of db1186.eqiad.wmnet completed
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:15 sbisson@deploy1003: sbisson, abi: Continuing with deployment
* 13:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy rolling restart_daemons on A:tcpproxy and A:tcpproxy
* 13:05 sbisson@deploy1003: sbisson, abi: Backport for [[gerrit:1299676{{!}}Enable ULS v2 on group0 wikis]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1014.eqiad.wmnet with OS trixie
* 13:02 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1299676{{!}}Enable ULS v2 on group0 wikis]]
* 12:47 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2170: Migration of db2170.codfw.wmnet completed
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5004.eqsin.wmnet with OS bookworm
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1014.eqiad.wmnet with reason: host reimage
* 12:42 topranks: re-map DSCP AF41 from 'low' to 'normal' priority qos class on network [[phab:T424640|T424640]]
* 12:41 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1014.eqiad.wmnet with reason: host reimage
* 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2170.codfw.wmnet with OS trixie
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1186: Migration of db1186.eqiad.wmnet completed
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1014
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host rdb1014
* 12:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1186.eqiad.wmnet with OS trixie
* 12:21 jiji@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host rdb1014
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) rdb1014.eqiad.wmnet 42.48.64.10.in-addr.arpa 2.4.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:21 jiji@cumin1003: START - Cookbook sre.dns.wipe-cache rdb1014.eqiad.wmnet 42.48.64.10.in-addr.arpa 2.4.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host rdb1014 - jiji@cumin1003"
* 12:21 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host rdb1014 - jiji@cumin1003"
* 12:20 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2170.codfw.wmnet with reason: host reimage
* 12:16 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 12:13 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1014
* 12:12 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1014.eqiad.wmnet with OS trixie
* 12:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2170.codfw.wmnet with reason: host reimage
* 12:08 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300104{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1300102{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1299643{{!}}wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792)]] (duration: 11m 06s)
* 12:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1186.eqiad.wmnet with reason: host reimage
* 12:03 reedy@deploy1003: reedy: Continuing with deployment
* 12:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1186.eqiad.wmnet with reason: host reimage
* 11:59 reedy@deploy1003: reedy: Backport for [[gerrit:1300104{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1300102{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1299643{{!}}wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes c
* 11:57 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1300104{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1300102{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1299643{{!}}wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792)]]
* 11:53 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2170.codfw.wmnet with OS trixie
* 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ganeti5004
* 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti5004
* 11:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2170: Upgrading db2170.codfw.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2170: Upgrading db2170.codfw.wmnet
* 11:49 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti5004
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti5004.eqsin.wmnet 40.0.132.10.in-addr.arpa 0.4.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 11:49 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti5004.eqsin.wmnet 40.0.132.10.in-addr.arpa 0.4.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5004 - jmm@cumin2002"
* 11:49 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5004 - jmm@cumin2002"
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:48 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1186.eqiad.wmnet with OS trixie
* 11:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1186: Upgrading db1186.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1186: Upgrading db1186.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:38 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:35 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 11:34 jmm@cumin2002: START - Cookbook sre.hosts.move-vlan for host ganeti5004
* 11:34 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 11:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5004.eqsin.wmnet with OS bookworm
* 11:34 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 11:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1151: Security updates
* 11:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:33 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:33 root@cumin1003: START - Cookbook sre.mysql.pool pool db1151: Security updates
* 11:31 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:30 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:30 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:30 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:27 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:27 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:23 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:23 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:23 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 11:23 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 11:16 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 11:15 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 11:15 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:15 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:09 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1151: Security updates
* 11:09 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:09 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:09 root@cumin1003: START - Cookbook sre.mysql.depool depool db1151: Security updates
* 11:08 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300092{{!}}ProductionServices: re-add poolcounter2006 (T426736)]] (duration: 06m 55s)
* 11:04 blake@deploy1003: blake: Continuing with deployment
* 11:04 blake@deploy1003: blake: Backport for [[gerrit:1300092{{!}}ProductionServices: re-add poolcounter2006 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:03 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:01 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300092{{!}}ProductionServices: re-add poolcounter2006 (T426736)]]
* 10:59 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2006.codfw.wmnet
* 10:57 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:57 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:56 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:56 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:56 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:56 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2006.codfw.wmnet
* 10:56 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300087{{!}}ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736)]] (duration: 06m 42s)
* 10:51 blake@deploy1003: blake: Continuing with deployment
* 10:51 moritzm: remove ganeti5004 from eqsin cluster for reimage [[phab:T428229|T428229]]
* 10:51 blake@deploy1003: blake: Backport for [[gerrit:1300087{{!}}ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:49 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300087{{!}}ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736)]]
* 10:47 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2005.codfw.wmnet
* 10:47 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:46 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:46 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:45 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:43 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2005.codfw.wmnet
* 10:43 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300082{{!}}ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736)]] (duration: 07m 38s)
* 10:41 moritzm: installing nginx security updates
* 10:38 blake@deploy1003: blake: Continuing with deployment
* 10:38 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1152: Security updates
* 10:38 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:38 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:38 root@cumin1003: START - Cookbook sre.mysql.pool pool db1152: Security updates
* 10:38 blake@deploy1003: blake: Backport for [[gerrit:1300082{{!}}ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:37 moritzm: failover Ganeti master in eqsin to ganeti5007 [[phab:T428229|T428229]]
* 10:35 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300082{{!}}ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736)]]
* 10:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1007.eqiad.wmnet
* 10:29 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1007.eqiad.wmnet
* 10:29 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300072{{!}}ProductionServices: reboot poolcounter1007 (T426736)]] (duration: 07m 45s)
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 10:27 jmm@cumin2002: DONE (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for sretest2009.codfw.wmnet: Renew puppet certificate - jmm@cumin2002
* 10:24 blake@deploy1003: blake: Continuing with deployment
* 10:23 blake@deploy1003: blake: Backport for [[gerrit:1300072{{!}}ProductionServices: reboot poolcounter1007 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:21 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:21 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300072{{!}}ProductionServices: reboot poolcounter1007 (T426736)]]
* 10:21 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:21 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:21 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:20 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1006.eqiad.wmnet
* 10:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Security updates
* 10:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:14 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:14 root@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Security updates
* 10:13 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1006.eqiad.wmnet
* 10:12 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300064{{!}}ProductionServices: reboot poolcounter1006.eqiad (T426736)]] (duration: 07m 46s)
* 10:07 blake@deploy1003: blake: Continuing with deployment
* 10:06 blake@deploy1003: blake: Backport for [[gerrit:1300064{{!}}ProductionServices: reboot poolcounter1006.eqiad (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:04 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300064{{!}}ProductionServices: reboot poolcounter1006.eqiad (T426736)]]
* 09:57 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300058{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]], [[gerrit:1300059{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]] (duration: 09m 32s)
* 09:52 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1300058{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]], [[gerrit:1300059{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1300058{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]], [[gerrit:1300059{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]]
* 09:35 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 09:34 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 09:32 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 09:32 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 09:26 moritzm: upgrade routinator in eqiad to 0.15.2 [[phab:T428456|T428456]]
* 09:23 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:23 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 09:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus5003.eqsin.wmnet to plain
* 09:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to plain
* 09:15 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:29 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:29 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:01 fceratto@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host db1215.eqiad.wmnet with OS trixie
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 07:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
* 07:44 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1215.eqiad.wmnet with reason: host reimage
* 07:41 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 07:40 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
* 07:40 moritzm: installing openssl security updates
* 07:39 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1215.eqiad.wmnet with reason: host reimage
* 07:38 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 07:37 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 07:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:29 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299556{{!}}ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561{{!}}ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529{{!}}translate: adding separate read/write endpoints (T425377)]] (duration: 14m 03s)
* 07:25 atsuko@deploy1003: atsuko: Continuing with deployment
* 07:23 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1215.eqiad.wmnet with OS trixie
* 07:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1215.eqiad.wmnet with reason: Reimage
* 07:21 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:20 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:20 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:17 atsuko@deploy1003: atsuko: Backport for [[gerrit:1299556{{!}}ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561{{!}}ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529{{!}}translate: adding separate read/write endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:15 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1299556{{!}}ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561{{!}}ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529{{!}}translate: adding separate read/write endpoints (T425377)]]
* 07:14 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:12 atsukoito: backporting extensions/Translate to wmf/1.47.0-wmf.5 and applying the config
* 07:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 06:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 06:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 05:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 05:43 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 05:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 05:41 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 47s)
* 02:07 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1008.eqiad.wmnet with OS trixie
* 02:03 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 02:02 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:52 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:51 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:51 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:50 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:50 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:49 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1008.eqiad.wmnet with reason: host reimage
* 01:49 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:49 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:49 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:49 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:48 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 01:48 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 01:47 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 01:47 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 01:46 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 01:46 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 01:44 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 01:44 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 01:43 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 01:43 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1008.eqiad.wmnet with reason: host reimage
* 01:25 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main1008
* 01:24 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main1008
* 01:24 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main1008
* 01:24 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main1008.eqiad.wmnet 45.32.64.10.in-addr.arpa 5.4.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 01:23 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main1008.eqiad.wmnet 45.32.64.10.in-addr.arpa 5.4.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 01:23 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:23 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1008 - jasmine@cumin2002"
* 01:23 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1008 - jasmine@cumin2002"
* 01:19 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 01:12 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main1008
* 01:11 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1008.eqiad.wmnet with OS trixie
* 01:00 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2009.codfw.wmnet with OS trixie
* 00:54 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 00:53 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 00:43 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
* 00:40 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:38 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
* 00:38 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:38 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:37 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:37 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:36 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 00:36 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:34 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 00:34 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 00:33 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 00:33 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 00:32 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 00:32 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 00:32 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2009
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2009
* 00:15 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2009
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2009.codfw.wmnet 33.48.192.10.in-addr.arpa 3.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:15 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2009.codfw.wmnet 33.48.192.10.in-addr.arpa 3.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2009 - jasmine@cumin2002"
* 00:15 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2009 - jasmine@cumin2002"
* 00:10 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 00:03 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2009
* 00:03 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS trixie
== 2026-06-09 ==
* 22:50 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299640{{!}}HandleSectionLinks: add temporary fallback to identify html headings (T428677)]] (duration: 08m 59s)
* 22:45 cscott@deploy1003: cscott: Continuing with deployment
* 22:43 cscott@deploy1003: cscott: Backport for [[gerrit:1299640{{!}}HandleSectionLinks: add temporary fallback to identify html headings (T428677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:41 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1299640{{!}}HandleSectionLinks: add temporary fallback to identify html headings (T428677)]]
* 22:15 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299639{{!}}[Bug] Donor Badge: Remove client prefs for control group (T428501)]] (duration: 20m 57s)
* 22:11 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:07 mutante: gerrit - apache httpd log file location moved to /srv/gerrit/site_path/review_site/logs/ [[phab:T425667|T425667]]
* 22:06 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on gerrit2003.wikimedia.org with reason: debug
* 21:56 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1299639{{!}}[Bug] Donor Badge: Remove client prefs for control group (T428501)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1299639{{!}}[Bug] Donor Badge: Remove client prefs for control group (T428501)]]
* 21:52 ryankemper: [[phab:T428241|T428241]] removed retired wdqs2009 full-graph journal dump (446G x2, ~892G) from clouddumps100[1-2]:/srv/dumps/xmldatadumps/public/other/wdqs
* 21:49 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299602{{!}}Revert "Create VectorComponentPageToolbar component" (T428649)]] (duration: 08m 16s)
* 21:48 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
* 21:45 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 21:43 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1299602{{!}}Revert "Create VectorComponentPageToolbar component" (T428649)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:41 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1299602{{!}}Revert "Create VectorComponentPageToolbar component" (T428649)]]
* 21:34 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit1003.wikimedia.org with reason: debug
* 21:27 maryum: Deployed security fix for [[phab:T428324|T428324]]
* 21:24 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
* 21:15 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
* 21:06 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
* 20:50 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS trixie
* 20:50 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299588{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270)]], [[gerrit:1299589{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T428270)]] (duration: 11m 13s)
* 20:46 cscott@deploy1003: cscott: Continuing with deployment
* 20:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS trixie
* 20:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:42 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:41 cscott@deploy1003: cscott: Backport for [[gerrit:1299588{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270)]], [[gerrit:1299589{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T428270)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:39 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1299588{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270)]], [[gerrit:1299589{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T428270)]]
* 20:38 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:33 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:32 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299454{{!}}wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902)]] (duration: 22m 08s)
* 20:28 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:28 cscott@deploy1003: cscott, gkyziridis: Continuing with deployment
* 20:24 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2004
* 20:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2004
* 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2003
* 20:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2003
* 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2002
* 20:13 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2002
* 20:13 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2001
* 20:13 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2001
* 20:12 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:12 cscott@deploy1003: cscott, gkyziridis: Backport for [[gerrit:1299454{{!}}wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1299454{{!}}wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902)]]
* 20:09 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:04 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:59 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:53 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:48 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:47 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:47 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:28 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wdqs1015.eqiad.wmnet
* 19:28 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:28 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wdqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
* 19:27 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wdqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
* 19:20 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
* 19:15 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2008.codfw.wmnet with OS trixie
* 19:15 ryankemper@cumin2002: START - Cookbook sre.hosts.decommission for hosts wdqs1015.eqiad.wmnet
* 19:12 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 19:12 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 19:00 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:58 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 18:58 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
* 18:58 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 18:58 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:57 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:57 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 18:56 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 18:56 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:55 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:55 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 18:55 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 18:54 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 18:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:54 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 18:53 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 18:53 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 18:53 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2003 to codfw - jhancock@cumin2002"
* 18:52 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2003 to codfw - jhancock@cumin2002"
* 18:52 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 18:52 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 18:51 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
* 18:51 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 18:51 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 18:51 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 18:50 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 18:50 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 18:47 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:47 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:47 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:42 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:42 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:31 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 18:29 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS trixie
* 18:26 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2008.codfw.wmnet with OS trixie
* 17:48 mutante: https://releases.wikimedia.org {{!}} https://releases-jenkins.wikimedia.org - down for maintenance [[phab:T418299|T418299]]
* 17:48 cmooney@dns2005: END - running authdns-update
* 17:47 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: reimage
* 17:47 cmooney@dns2005: START - running authdns-update
* 17:46 sukhe: sudo cumin 'A:hcaptcha-proxy' 'run-puppet-agent': rolling out CR {{Gerrit|1299427}} [[phab:T428539|T428539]]
* 17:43 jayme: kafka-main2008 is down due to hardware failure [[phab:T428654|T428654]]
* 17:32 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1002.eqiad.wmnet with OS trixie
* 17:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
* 17:06 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
* 17:05 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2008
* 17:05 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2008
* 17:04 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:04 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2008
* 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2008.codfw.wmnet 4.32.192.10.in-addr.arpa 4.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:04 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:04 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2008.codfw.wmnet 4.32.192.10.in-addr.arpa 4.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2008 - jasmine@cumin2002"
* 17:04 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5018
* 17:04 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2008 - jasmine@cumin2002"
* 17:03 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5018.eqsin.wmnet with OS trixie
* 16:58 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:58 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:57 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 16:57 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:57 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:50 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf1002.eqiad.wmnet with OS trixie
* 16:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:47 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1001.eqiad.wmnet with OS trixie
* 16:47 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 16:47 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/redioscope: apply
* 16:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:41 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 16:41 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 16:35 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2008
* 16:34 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS trixie
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:31 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
* 16:30 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
* 16:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
* 16:29 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:26 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
* 16:23 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
* 16:22 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: apply
* 16:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:16 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:12 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf1001.eqiad.wmnet with OS trixie
* 16:10 jiji@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'sync'.
* 16:09 jiji@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'sync'.
* 16:07 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf2002.codfw.wmnet with OS trixie
* 16:02 jiji@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 16:02 jiji@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 16:00 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'.
* 15:59 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
* 15:59 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'.
* 15:59 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 15:59 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'.
* 15:59 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/termbox: apply
* 15:58 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/termbox: apply
* 15:58 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/termbox: apply
* 15:57 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 15:57 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 15:57 lucaswerkmeister-wmde@deploy1003: helmfile [staging] DONE helmfile.d/services/termbox: apply
* 15:56 lucaswerkmeister-wmde@deploy1003: helmfile [staging] START helmfile.d/services/termbox: apply
* 15:54 jiji@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:53 jiji@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 15:51 jiji@deploy1003: Finished scap sync-world: redeploy {{Gerrit|1299468}} (duration: 07m 23s)
* 15:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf2002.codfw.wmnet with reason: host reimage
* 15:47 jiji@deploy1003: jiji: Continuing with deployment
* 15:46 jiji@deploy1003: jiji: redeploy {{Gerrit|1299468}} synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:46 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf2002.codfw.wmnet with reason: host reimage
* 15:45 jiji@deploy1003: Started scap sync-world: redeploy {{Gerrit|1299468}}
* 15:43 brouberol@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-eqiad
* 15:34 brennen@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab1004 for [[phab:T410849|T410849]] (followup for robots.txt) (duration: 00m 40s)
* 15:33 brennen@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab1004 for [[phab:T410849|T410849]] (followup for robots.txt)
* 15:33 brennen@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab2002 for [[phab:T410849|T410849]] (followup for robots.txt) (duration: 00m 45s)
* 15:32 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299468{{!}}ProductionServices.php: switch filebackend.php to rdb2015:6381 #2 (T418918 T291916)]] (duration: 07m 21s)
* 15:32 brennen@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab2002 for [[phab:T410849|T410849]] (followup for robots.txt)
* 15:28 jiji@deploy1003: Rolling back deployment
* 15:27 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf2002.codfw.wmnet with OS trixie
* 15:27 jiji@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 15:26 jiji@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 15:25 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1299468{{!}}ProductionServices.php: switch filebackend.php to rdb2015:6381 #2 (T418918 T291916)]]
* 15:22 urbanecm: Remove `migrateMentorStatusAwayToCommunityConfiguration` from updatelog on all wikis ([[phab:T409170|T409170]]; the script was only ever run as a dry-run)
* 15:21 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 15:21 jiji@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 15:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf2001.codfw.wmnet with OS trixie
* 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@d244a3e]: deploy phab1004 for [[phab:T410849|T410849]] (duration: 00m 42s)
* 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@d244a3e]: deploy phab1004 for [[phab:T410849|T410849]]
* 15:02 brennen@deploy1003: Finished deploy [phabricator/deployment@d244a3e]: deploy phab2002 for [[phab:T410849|T410849]] (duration: 00m 45s)
* 15:01 brennen@deploy1003: Started deploy [phabricator/deployment@d244a3e]: deploy phab2002 for [[phab:T410849|T410849]]
* 14:58 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf2001.codfw.wmnet with reason: host reimage
* 14:52 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf2001.codfw.wmnet with reason: host reimage
* 14:52 arnaudb@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab[2002-2003].codfw.wmnet,phab[1004-1006].eqiad.wmnet with reason: [[phab:T410849|T410849]]
* 14:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 14:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 14:40 moritzm: upgrade routinator in codfw to 0.15.2 [[phab:T428456|T428456]]
* 14:35 brouberol@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 14:33 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf2001.codfw.wmnet with OS trixie
* 14:26 brouberol@cumin1003: END (ERROR) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=97) rolling reboot on A:cephosd-eqiad
* 14:26 brouberol@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 14:20 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-codfw
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host parsoidtest1001.eqiad.wmnet
* 14:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2153: Migration of db2153.codfw.wmnet completed
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of rpki2003.codfw.wmnet to drbd
* 14:14 moritzm: imported routinator 0.15.2-1bookworm to thirdparty/routinator for bookworm-wikimedia [[phab:T428456|T428456]]
* 14:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1184: Migration of db1184.eqiad.wmnet completed
* 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host parsoidtest1001.eqiad.wmnet
* 14:07 Dreamy_Jazz: Afternoon UTC backport window done
* 14:07 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:06 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299495{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]], [[gerrit:1299502{{!}}SecurePollLogPager: Cast user IDs to ints before use (T428599)]] (duration: 06m 53s)
* 14:06 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: rack depool
* 14:03 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of rpki2003.codfw.wmnet to drbd
* 14:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow2004.codfw.wmnet to drbd
* 14:02 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:02 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1299495{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]], [[gerrit:1299502{{!}}SecurePollLogPager: Cast user IDs to ints before use (T428599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:59 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1299495{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]], [[gerrit:1299502{{!}}SecurePollLogPager: Cast user IDs to ints before use (T428599)]]
* 13:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:56 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:56 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:56 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 13:56 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 13:55 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:55 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* {{safesubst:SAL entry|1=13:55 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298929{{!}}Simplify fragment processing (T423700)]], [[gerrit:1298926{{!}}Move ::getFragmentsToTransform() to Content<nowiki>{</nowiki>Text,DOM<nowiki>}</nowiki>TransformStage]], [[gerrit:1298927{{!}}OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages]], [[gerrit:1298925{{!}}Reset DeduplicateStyles state between different pipeline executions (T428336 T428215)]], [[gerrit:1299497}}
* 13:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:51 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow2004.codfw.wmnet to drbd
* 13:50 cscott@deploy1003: cscott: Continuing with deployment
* 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2045.codfw.wmnet to cluster codfw and group A
* 13:48 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2045.codfw.wmnet to cluster codfw and group A
* 13:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2027.codfw.wmnet to cluster codfw and group A
* 13:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2027.codfw.wmnet to cluster codfw and group A
* 13:46 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 13:45 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 13:44 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* {{safesubst:SAL entry|1=13:42 cscott@deploy1003: cscott: Backport for [[gerrit:1298929{{!}}Simplify fragment processing (T423700)]], [[gerrit:1298926{{!}}Move ::getFragmentsToTransform() to Content<nowiki>{</nowiki>Text,DOM<nowiki>}</nowiki>TransformStage]], [[gerrit:1298927{{!}}OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages]], [[gerrit:1298925{{!}}Reset DeduplicateStyles state between different pipeline executions (T428336 T428215)]], [[gerrit:1299497{{!}}Store indicators}}
* 13:41 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* {{safesubst:SAL entry|1=13:40 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1298929{{!}}Simplify fragment processing (T423700)]], [[gerrit:1298926{{!}}Move ::getFragmentsToTransform() to Content<nowiki>{</nowiki>Text,DOM<nowiki>}</nowiki>TransformStage]], [[gerrit:1298927{{!}}OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages]], [[gerrit:1298925{{!}}Reset DeduplicateStyles state between different pipeline executions (T428336 T428215)]], [[gerrit:1299497{{!}}}}
* 13:40 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw
* 13:39 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 13:37 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 13:35 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 13:33 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 13:32 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 13:32 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298834{{!}}config: Disable EmailConfirmationBanner on all wikis (T428291)]] (duration: 07m 01s)
* 13:30 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2153: Migration of db2153.codfw.wmnet completed
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 lucaswerkmeister-wmde@deploy1003: mmartorana, lucaswerkmeister-wmde: Continuing with deployment
* 13:27 lucaswerkmeister-wmde@deploy1003: mmartorana, lucaswerkmeister-wmde: Backport for [[gerrit:1298834{{!}}config: Disable EmailConfirmationBanner on all wikis (T428291)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:26 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1184: Migration of db1184.eqiad.wmnet completed
* 13:25 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298834{{!}}config: Disable EmailConfirmationBanner on all wikis (T428291)]]
* 13:25 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:24 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:21 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 13:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2153.codfw.wmnet with OS trixie
* 13:20 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2241: rack depool
* 13:20 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1237: repool after maintenance db1237
* 13:19 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298654{{!}}Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206)]] (duration: 09m 40s)
* 13:17 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2006.codfw.wmnet
* 13:17 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2006.codfw.wmnet
* 13:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2251-2253].codfw.wmnet
* 13:16 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2251-2253].codfw.wmnet
* 13:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2005.codfw.wmnet
* 13:16 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2005.codfw.wmnet
* 13:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1184.eqiad.wmnet with OS trixie
* 13:14 lucaswerkmeister-wmde@deploy1003: neriah, lucaswerkmeister-wmde: Continuing with deployment
* 13:11 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
* 13:11 lucaswerkmeister-wmde@deploy1003: neriah, lucaswerkmeister-wmde: Backport for [[gerrit:1298654{{!}}Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:09 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298654{{!}}Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206)]]
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2153.codfw.wmnet with reason: host reimage
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1015.eqiad.wmnet with OS trixie
* 12:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1184.eqiad.wmnet with reason: host reimage
* 12:58 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2153.codfw.wmnet with reason: host reimage
* 12:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1016.eqiad.wmnet with OS trixie
* 12:57 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
* 12:56 XioNoX: lsw1-a4-codfw> request system reboot - [[phab:T427357|T427357]]
* 12:55 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1184.eqiad.wmnet with reason: host reimage
* 12:50 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299477{{!}}hCaptcha: Roll out to all wikis for api account creation. (T426050)]] (duration: 07m 21s)
* 12:46 kharlan@deploy1003: kharlan, dbrant: Continuing with deployment
* 12:46 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
* 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 12:45 kharlan@deploy1003: kharlan, dbrant: Backport for [[gerrit:1299477{{!}}hCaptcha: Roll out to all wikis for api account creation. (T426050)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:45 topranks: shut sub-interfaces for row A/B legacy vlans on cr1-codfw [[phab:T427357|T427357]]
* 12:45 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
* 12:43 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1299477{{!}}hCaptcha: Roll out to all wikis for api account creation. (T426050)]]
* 12:42 topranks: increase OSPF cost on ssw1-a1-codfw link to lsw1-a4-codfw to force traffic via alternate spine [[phab:T427357|T427357]]
* 12:41 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299478{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]] (duration: 07m 02s)
* 12:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 12:40 moritzm: installing wireshark security updates
* 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2153.codfw.wmnet with OS trixie
* 12:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1184.eqiad.wmnet with OS trixie
* 12:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:36 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1299478{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2153: Upgrading db2153.codfw.wmnet
* 12:34 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1237: repool after maintenance db1237
* 12:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1299478{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]]
* 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2153: Upgrading db2153.codfw.wmnet
* 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1184: Upgrading db1184.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1184: Upgrading db1184.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1237.eqiad.wmnet with OS trixie
* 12:32 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 12:32 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 12:29 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:29 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:27 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2005.codfw.wmnet
* 12:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2046: repool after maintenance
* 12:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2006.codfw.wmnet
* 12:23 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298829{{!}}wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126)]] (duration: 16m 04s)
* 12:23 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2006.codfw.wmnet
* 12:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2251-2253].codfw.wmnet
* 12:22 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2005.codfw.wmnet
* 12:20 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2251-2253].codfw.wmnet
* 12:20 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:20 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: rack depool
* 12:20 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:20 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2241: rack depool
* 12:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1016
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1016
* 12:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1015
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1015
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1016.eqiad.wmnet with OS trixie
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1015.eqiad.wmnet with OS trixie
* 12:17 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
* 12:17 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 24 hosts with reason: Rack A4 depool
* 12:16 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Continuing with deployment
* 12:15 topranks: drain traffic on ssw1-a1-codfw - add gshut community in evpn underlay - [[phab:T427357|T427357]]
* 12:14 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
* 12:13 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Backport for [[gerrit:1298829{{!}}wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1237.eqiad.wmnet with reason: host reimage
* 12:07 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1298829{{!}}wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126)]]
* 12:05 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1237.eqiad.wmnet with reason: host reimage
* 12:00 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Dmaza out of all services on: 2435 hosts
* 11:51 atsuko@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 11:51 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
* 11:49 atsuko@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 11:48 atsuko@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 11:47 atsuko@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 11:45 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:44 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:43 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:43 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2046: repool after maintenance
* 11:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:36 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2046.codfw.wmnet with OS trixie
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2185.codfw.wmnet with reason: Reimage
* 11:31 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging HMonroy out of all services on: 2435 hosts
* 11:28 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging KSiebert out of all services on: 2435 hosts
* 11:26 slyngs: CAS-SSO upgrade to version 7.3.7.2
* 11:26 slyngshede@dns1004: END - running authdns-update
* 11:24 slyngshede@dns1004: START - running authdns-update
* 11:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2046.codfw.wmnet with reason: host reimage
* 11:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1043: repool after upgrade
* 11:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2046.codfw.wmnet with reason: host reimage
* 10:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2046.codfw.wmnet with OS trixie
* 10:53 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2046: Upgrading es2046.codfw.wmnet
* 10:53 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2046: Upgrading es2046.codfw.wmnet
* 10:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 10:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 10:52 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 10:52 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:52 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 10:51 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:32 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1043: repool after upgrade
* 10:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:28 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1160: Repooling
* 10:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1043.eqiad.wmnet with OS trixie
* 10:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:17 elukey: complete rollout of apache2 upgrades
* 10:16 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:15 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:13 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:12 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:12 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1043.eqiad.wmnet with reason: host reimage
* 10:04 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1043.eqiad.wmnet with reason: host reimage
* 10:04 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:04 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:04 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:57 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
* 09:51 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:51 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:50 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:50 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS trixie
* 09:48 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1043: Upgrading es1043.eqiad.wmnet
* 09:48 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:47 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:45 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 09:41 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 09:36 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=5 --verbose --last-checked="20260603"` (after stopping previous scan run)
* 09:34 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=5 --verbose` (after stopping previous scan run)
* 09:27 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 09:26 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 09:17 fceratto@cumin1003: MariaDB change: Setting sections s5 as read-write
* 09:17 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1043: Upgrading es1043.eqiad.wmnet
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:12 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1042 to es4 eqiad primary [[phab:T428386|T428386]]', diff saved to https://phabricator.wikimedia.org/P93943 and previous config saved to /var/cache/conftool/dbconfig/20260609-091215-marostegui.json
* 09:11 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1043 to es4 eqiad primary [[phab:T428386|T428386]]', diff saved to https://phabricator.wikimedia.org/P93942 and previous config saved to /var/cache/conftool/dbconfig/20260609-091147-marostegui.json
* 09:03 jiji@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=registry2005.codfw.wmnet
* 08:59 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:59 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:57 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1237.eqiad.wmnet with OS trixie
* 08:55 jiji@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=registry2005.codfw.wmnet
* 08:55 jiji@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=registry2004.codfw.wmnet
* 08:50 jiji@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=registry2004.codfw.wmnet
* 08:22 jiji@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=docker-registry,name=codfw
* 08:22 jiji@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=docker-registry,name=eqiad
* 08:08 jiji@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=docker-registry,name=eqiad
* 08:08 jiji@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=docker-registry,name=codfw
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix typoes - ayounsi@cumin1003"
* 07:59 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix typoes - ayounsi@cumin1003"
* 07:52 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:47 brouberol@dns1004: END - running authdns-update
* 07:46 brouberol@dns1004: START - running authdns-update
* 07:44 brouberol@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:43 brouberol@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:43 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:42 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:41 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:39 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:38 brouberol@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 07:37 brouberol@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 07:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
* 07:36 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.major-upgrade (exit_code=97)
* 07:36 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 07:36 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:26 fceratto@dns1004: END - running authdns-update
* 07:24 fceratto@dns1004: START - running authdns-update
* 07:22 marostegui@dns1004: END - running authdns-update
* 07:21 marostegui@dns1004: START - running authdns-update
* 07:19 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:19 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fix dse-k8s-wdqs2002 duplicate ipv6 address - elukey@cumin1003"
* 07:19 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fix dse-k8s-wdqs2002 duplicate ipv6 address - elukey@cumin1003"
* 07:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 07:12 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 07:11 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1160: Repooling
* 07:11 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
* 07:11 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1160: Repooling
* 07:11 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
* 07:00 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:00 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1237.eqiad.wmnet with OS trixie
* 06:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1160 [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93940 and previous config saved to /var/cache/conftool/dbconfig/20260609-062412-fceratto.json
* 06:17 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:16 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:16 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:16 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:15 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:15 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:15 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:14 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:12 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1244 to s4 primary and set section read-write [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93939 and previous config saved to /var/cache/conftool/dbconfig/20260609-061222-fceratto.json
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Set s4 eqiad as read-only for maintenance - [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93938 and previous config saved to /var/cache/conftool/dbconfig/20260609-061131-fceratto.json
* 06:10 federico3: Starting s4 eqiad failover from db1160 to db1244 - [[phab:T426086|T426086]]
* 06:01 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1244 with weight 0 [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93937 and previous config saved to /var/cache/conftool/dbconfig/20260609-060121-fceratto.json
* 06:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 40 hosts with reason: Primary switchover s4 [[phab:T426086|T426086]]
* 05:40 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
* 05:37 marostegui@dns1004: START - running authdns-update
* 05:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1237: Upgrading db1237.eqiad.wmnet
* 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1237: Upgrading db1237.eqiad.wmnet
* 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1237 [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93935 and previous config saved to /var/cache/conftool/dbconfig/20260609-052420-marostegui.json
* 05:23 marostegui@dns1004: START - running authdns-update
* 05:23 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1220 to x1 primary and set section read-write [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93934 and previous config saved to /var/cache/conftool/dbconfig/20260609-052311-marostegui.json
* 05:22 marostegui@cumin1003: dbctl commit (dc=all): 'Set x1 eqiad as read-only for maintenance - [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93933 and previous config saved to /var/cache/conftool/dbconfig/20260609-052253-marostegui.json
* 05:22 marostegui: Starting x1 eqiad failover from db1237 to db1220 - [[phab:T428158|T428158]]
* 05:19 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1220 with weight 0 [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93932 and previous config saved to /var/cache/conftool/dbconfig/20260609-051859-marostegui.json
* 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 16 hosts with reason: Primary switchover x1 [[phab:T428158|T428158]]
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.3 (duration: 02m 43s)
* 03:40 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.6 refs [[phab:T423915|T423915]] (duration: 37m 16s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-08 ==
* 22:00 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298915{{!}}CommonSettings: Set $wgScoreSafeMode = false (T428484)]] (duration: 07m 42s)
* 21:56 reedy@deploy1003: reedy: Continuing with deployment
* 21:54 reedy@deploy1003: reedy: Backport for [[gerrit:1298915{{!}}CommonSettings: Set $wgScoreSafeMode = false (T428484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:53 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1298915{{!}}CommonSettings: Set $wgScoreSafeMode = false (T428484)]]
* 21:12 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298891{{!}}OOUIHTMLForm: Avoid treating form header as a clickable label (T428359)]] (duration: 08m 10s)
* 21:07 mlitn@deploy1003: mlitn, neriah: Continuing with deployment
* 21:05 mlitn@deploy1003: mlitn, neriah: Backport for [[gerrit:1298891{{!}}OOUIHTMLForm: Avoid treating form header as a clickable label (T428359)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:03 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1298891{{!}}OOUIHTMLForm: Avoid treating form header as a clickable label (T428359)]]
* 20:43 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297162{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias]], [[gerrit:1298841{{!}}Squashed diff to master]] (duration: 07m 05s)
* 20:39 mlitn@deploy1003: mlitn: Continuing with deployment
* 20:38 mlitn@deploy1003: mlitn: Backport for [[gerrit:1297162{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias]], [[gerrit:1298841{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:36 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1297162{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias]], [[gerrit:1298841{{!}}Squashed diff to master]]
* 20:29 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298390{{!}}English Wikibooks: update FlaggedRevs configuration (T428329)]], [[gerrit:1298328{{!}}English Wikiversity: Add new user group "autopatrolled" (T428269)]] (duration: 08m 58s)
* 20:25 mlitn@deploy1003: mlitn, vadymts1: Continuing with deployment
* 20:22 mlitn@deploy1003: mlitn, vadymts1: Backport for [[gerrit:1298390{{!}}English Wikibooks: update FlaggedRevs configuration (T428329)]], [[gerrit:1298328{{!}}English Wikiversity: Add new user group "autopatrolled" (T428269)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:20 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1298390{{!}}English Wikibooks: update FlaggedRevs configuration (T428329)]], [[gerrit:1298328{{!}}English Wikiversity: Add new user group "autopatrolled" (T428269)]]
* 20:03 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298879{{!}}SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437)]] (duration: 37m 43s)
* 19:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:31 kharlan@deploy1003: kharlan: Continuing with deployment
* 19:30 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:29 kharlan@deploy1003: kharlan: Backport for [[gerrit:1298879{{!}}SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:28 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:27 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:25 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1298879{{!}}SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437)]]
* 19:24 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab (duration: 01m 32s)
* 19:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:22 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab
* 19:20 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab (duration: 01m 40s)
* 19:19 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab
* 19:16 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:14 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2004
* 18:52 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2004
* 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2003
* 18:52 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2003
* 18:51 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:51 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2004 to codfw - jhancock@cumin2002"
* 18:51 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2004 to codfw - jhancock@cumin2002"
* 18:44 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:42 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:42 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2030 to codfw - jhancock@cumin2002"
* 18:42 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2030 to codfw - jhancock@cumin2002"
* 18:37 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:33 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2002
* 18:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2002
* 18:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2002 to codfw - jhancock@cumin2002"
* 18:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2002 to codfw - jhancock@cumin2002"
* 18:25 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:22 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2001
* 18:22 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2001
* 18:21 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating dse-k8s-wdqs2001 to codfw - jhancock@cumin2002"
* 18:21 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating dse-k8s-wdqs2001 to codfw - jhancock@cumin2002"
* 18:17 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:02 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T427286|T427286]] (duration: 00m 12s)
* 18:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T427286|T427286]]
* 17:37 jnuche@deploy1003: Installation of scap version "4.268.0" completed for 2 hosts
* 17:35 jnuche@deploy1003: Installing scap version "4.268.0" for 2 host(s)
* 17:21 claime: restarting varnish-frontend service on cp6012
* 17:21 claime: restarting varnish-frontend service on cp6011
* 17:21 claime: restarted varnish-frontend service on cp6009
* 17:13 taavi: bounce sirenbot to get it to re-join a channel
* 17:05 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 17:05 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:58 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 16:57 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 16:55 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 16:53 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 16:53 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 16:52 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 16:30 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 16:29 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 16:29 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 16:27 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 16:27 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 16:26 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 16:26 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 16:25 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 16:18 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 16:17 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 16:17 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 16:16 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 16:16 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:16 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:16 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 16:15 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 16:14 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 16:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 16:14 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 16:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 16:13 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 16:13 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 16:13 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 16:12 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 16:12 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 16:09 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 16:08 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 16:08 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 16:07 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 16:06 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 15:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2042: repool after upgrade
* 15:45 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db[2183-2184].codfw.wmnet
* 15:45 jynus@cumin2002: START - Cookbook sre.hosts.remove-downtime for db[2183-2184].codfw.wmnet
* 15:18 jynus: dbmaint on backup1-codfw@codfw ([[phab:T428467|T428467]])
* 15:12 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2042: repool after upgrade
* 15:12 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 15:09 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:09 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2042.codfw.wmnet with OS trixie
* 15:04 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:04 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:03 jynus@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2183-2184].codfw.wmnet with reason: Switchover db
* 15:03 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:01 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 15:00 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 14:59 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:55 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:55 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:54 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:50 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:50 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:50 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:49 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2042.codfw.wmnet with reason: host reimage
* 14:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2042.codfw.wmnet with reason: host reimage
* 14:32 Lucas_WMDE: UTC afternoon backport+config window done
* 14:32 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298709{{!}}Add translatable messages for WikiProject names (T427804)]], [[gerrit:1298710{{!}}Use translatable messages for WikiProject links (T427804)]], [[gerrit:1297644{{!}}WikiProject links - remove 'text' config (T427804)]] (duration: 31m 57s)
* 14:27 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:26 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2042.codfw.wmnet with OS trixie
* 14:26 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2042: Upgrading es2042.codfw.wmnet
* 14:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2042: Upgrading es2042.codfw.wmnet
* 14:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:24 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2043 to es4 codfw primary [[phab:T428386|T428386]]', diff saved to https://phabricator.wikimedia.org/P93926 and previous config saved to /var/cache/conftool/dbconfig/20260608-142423-marostegui.json
* 14:23 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1041: repool after maintenance
* 14:19 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Continuing with deployment
* 14:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Backport for [[gerrit:1298709{{!}}Add translatable messages for WikiProject names (T427804)]], [[gerrit:1298710{{!}}Use translatable messages for WikiProject links (T427804)]], [[gerrit:1297644{{!}}WikiProject links - remove 'text' config (T427804)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:11 cgoubert@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=liftwing-openapi-server.*
* 14:10 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp6013.*
* 14:10 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:05 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 14:05 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 13:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 13:52 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:50 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296550{{!}}hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608)]] (duration: 08m 31s)
* 13:48 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:46 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 13:43 cgoubert@dns1004: END - running authdns-update
* 13:43 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296550{{!}}hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:41 cgoubert@dns1004: START - running authdns-update
* 13:41 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296550{{!}}hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608)]]
* 13:39 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* {{safesubst:SAL entry|1=13:38 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298758{{!}}feat(V2): toggle experiment features based on custom url override (T424646)]], [[gerrit:1298762{{!}}specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646)]], [[gerrit:1298764{{!}}fix: correctly read experiments param on Special:UserLogin]], [[gerrit:1298765{{!}}signup.js: use JS var instead of TestKitchen to show exp}}
* 13:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1041: repool after maintenance
* 13:38 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 13:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:37 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 13:36 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 13:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1041.eqiad.wmnet with OS trixie
* 13:34 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 13:34 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2041: repool after upgrade
* 13:34 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde: Continuing with deployment
* 13:34 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 13:32 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* {{safesubst:SAL entry|1=13:30 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde: Backport for [[gerrit:1298758{{!}}feat(V2): toggle experiment features based on custom url override (T424646)]], [[gerrit:1298762{{!}}specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646)]], [[gerrit:1298764{{!}}fix: correctly read experiments param on Special:UserLogin]], [[gerrit:1298765{{!}}signup.js: use JS var instead of TestKitchen to show}}
* {{safesubst:SAL entry|1=13:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298758{{!}}feat(V2): toggle experiment features based on custom url override (T424646)]], [[gerrit:1298762{{!}}specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646)]], [[gerrit:1298764{{!}}fix: correctly read experiments param on Special:UserLogin]], [[gerrit:1298765{{!}}signup.js: use JS var instead of TestKitchen to show expe}}
* 13:21 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298418{{!}}NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206)]], [[gerrit:1298717{{!}}Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206)]], [[gerrit:1298734{{!}}Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206)]] (duration: 11m 06s)
* 13:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1041.eqiad.wmnet with reason: host reimage
* 13:17 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
* 13:12 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 13:12 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1298418{{!}}NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206)]], [[gerrit:1298717{{!}}Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206)]], [[gerrit:1298734{{!}}Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki
* 13:12 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 13:12 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1041.eqiad.wmnet with reason: host reimage
* 13:11 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 13:11 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 13:10 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298418{{!}}NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206)]], [[gerrit:1298717{{!}}Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206)]], [[gerrit:1298734{{!}}Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206)]]
* 12:57 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298767{{!}}Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608)]] (duration: 06m 20s)
* 12:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1041.eqiad.wmnet with OS trixie
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1041: Upgrading es1041.eqiad.wmnet
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:55 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1041: Upgrading es1041.eqiad.wmnet
* 12:55 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:54 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:53 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:53 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1298767{{!}}Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:51 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1298767{{!}}Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608)]]
* 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2041: repool after upgrade
* 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:46 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2063.codfw.wmnet with OS bullseye
* 12:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2062.codfw.wmnet with OS bullseye
* 12:27 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2041.codfw.wmnet with OS trixie
* 12:21 joal@deploy1003: Finished deploy [analytics/refinery@d67c584] (thin): Regular analytics weekly train THIN [analytics/refinery@d67c584f] (duration: 02m 00s)
* 12:19 joal@deploy1003: Started deploy [analytics/refinery@d67c584] (thin): Regular analytics weekly train THIN [analytics/refinery@d67c584f]
* 12:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
* 12:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 12:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 12:16 joal@deploy1003: Finished deploy [analytics/refinery@d67c584]: Regular analytics weekly train [analytics/refinery@d67c584f] (duration: 07m 52s)
* 12:15 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
* 12:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
* 12:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2041.codfw.wmnet with reason: host reimage
* 12:08 joal@deploy1003: Started deploy [analytics/refinery@d67c584]: Regular analytics weekly train [analytics/refinery@d67c584f]
* 12:08 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
* 12:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add eqiad e8 public vlans - ayounsi@cumin1003"
* 12:06 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add eqiad e8 public vlans - ayounsi@cumin1003"
* 12:03 joal@deploy1003: Finished deploy [analytics/refinery@d67c584] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d67c584f] (duration: 02m 00s)
* 12:03 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2041.codfw.wmnet with reason: host reimage
* 12:01 joal@deploy1003: Started deploy [analytics/refinery@d67c584] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d67c584f]
* 12:01 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:00 ayounsi@cumin1003: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 12:00 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:00 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 12:00 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2063
* 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2063
* 11:57 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2063
* 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2063.codfw.wmnet 52.16.192.10.in-addr.arpa 2.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:56 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2063.codfw.wmnet 52.16.192.10.in-addr.arpa 2.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:56 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:56 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2063 - mvernon@cumin2002"
* 11:56 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2063 - mvernon@cumin2002"
* 11:51 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:51 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2063
* 11:50 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2063.codfw.wmnet with OS bullseye
* 11:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2062
* 11:50 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2062
* 11:49 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2062
* 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2062.codfw.wmnet 123.0.192.10.in-addr.arpa 3.2.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:49 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2062.codfw.wmnet 123.0.192.10.in-addr.arpa 3.2.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2062 - mvernon@cumin2002"
* 11:49 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2062 - mvernon@cumin2002"
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2041.codfw.wmnet with OS trixie
* 11:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2041: Upgrading es2041.codfw.wmnet
* 11:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2041: Upgrading es2041.codfw.wmnet
* 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:44 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.major-upgrade (exit_code=97)
* 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: repool after maintenance
* 11:43 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:43 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2062
* 11:42 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2062.codfw.wmnet with OS bullseye
* 11:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298728{{!}}SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032)]] (duration: 17m 39s)
* 11:25 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:18 Raine: progressively switching shellbox to bookworm (start)
* 11:15 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 11:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 11:14 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1298728{{!}}SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:13 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 11:12 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 11:12 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1298728{{!}}SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032)]]
* 11:02 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2062
* 11:02 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2063
* 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1042: repool after maintenance
* 10:58 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:56 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1042.eqiad.wmnet with OS trixie
* 10:47 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298721{{!}}GuessedThumbnailInfo: Also allow showing webp originals (T428202)]] (duration: 16m 41s)
* 10:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1042.eqiad.wmnet with reason: host reimage
* 10:39 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 10:39 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 10:38 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 10:36 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 10:36 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 10:35 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2043: repool after upgrade
* 10:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2160.codfw.wmnet with reason: Reboot
* 10:34 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1298721{{!}}GuessedThumbnailInfo: Also allow showing webp originals (T428202)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:34 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1042.eqiad.wmnet with reason: host reimage
* 10:30 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1298721{{!}}GuessedThumbnailInfo: Also allow showing webp originals (T428202)]]
* 10:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1042.eqiad.wmnet with OS trixie
* 10:18 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:18 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:18 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:18 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1042: Upgrading es1042.eqiad.wmnet
* 10:14 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:14 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:14 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:14 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1042: Upgrading es1042.eqiad.wmnet
* 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:12 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2063
* 10:09 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2062
* 10:07 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:07 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:07 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:06 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:52 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 09:52 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 09:50 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 09:49 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 09:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2043: repool after upgrade
* 09:49 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2043.codfw.wmnet with OS trixie
* 09:44 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 09:44 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 09:42 ozge@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: sync
* 09:42 ozge@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: sync
* 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2043.codfw.wmnet with reason: host reimage
* 09:27 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 09:23 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2043.codfw.wmnet with reason: host reimage
* 09:17 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 09:15 ozge@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: sync
* 09:15 ozge@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: sync
* 09:07 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2043.codfw.wmnet with OS trixie
* 09:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2043: Upgrading es2043.codfw.wmnet
* 09:06 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2043: Upgrading es2043.codfw.wmnet
* 09:05 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1217.eqiad.wmnet with OS trixie
* 08:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1217.eqiad.wmnet with reason: host reimage
* 08:15 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database urwikisource ([[phab:T415977|T415977]])
* 08:14 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database urwikisource ([[phab:T415977|T415977]])
* 08:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1217.eqiad.wmnet with reason: host reimage
* 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2052: repool after upgrade
* 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1051: repool after maintenance
* 08:03 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis urwikisource in section s5
* 07:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1217.eqiad.wmnet with OS trixie
* 07:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1217.eqiad.wmnet with reason: reimage
* 07:53 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
* 07:52 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis urwikisource in section s5
* 07:50 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis urwikisource in section s5
* 07:50 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.sanitize-wiki (exit_code=97) Managing sanitization for wikis urwikisource in section s5
* 07:50 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
* 07:44 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297681{{!}}Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662)]] (duration: 32m 51s)
* 07:32 wmde-fisch@deploy1003: wmde-fisch, lilients: Continuing with deployment
* 07:29 wmde-fisch@deploy1003: wmde-fisch, lilients: Backport for [[gerrit:1297681{{!}}Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:21 elukey: upgrade sudo package on an-* hosts for [[phab:T428384|T428384]]
* 07:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2052: repool after upgrade
* 07:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1051: repool after maintenance
* 07:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:12 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database urwikisource ([[phab:T415977|T415977]])
* 07:12 elukey: upgrade exim4 packages on seaborgium for security fixes
* 07:11 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1297681{{!}}Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662)]]
* 06:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1051.eqiad.wmnet with OS trixie
* 06:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1051.eqiad.wmnet with reason: host reimage
* 06:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1051.eqiad.wmnet with reason: host reimage
* 06:15 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database urwikisource ([[phab:T415977|T415977]])
* 05:58 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1051.eqiad.wmnet with OS trixie
* 05:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2052.codfw.wmnet with OS trixie
* 05:44 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1051: Upgrading es1051.eqiad.wmnet
* 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2052.codfw.wmnet with reason: host reimage
* 05:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2052.codfw.wmnet with reason: host reimage
* 05:35 marostegui@dns1004: END - running authdns-update
* 05:34 marostegui@dns1004: START - running authdns-update
* 05:33 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1051: Upgrading es1051.eqiad.wmnet
* 05:33 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:31 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1054 to es3 eqiad primary [[phab:T428050|T428050]]', diff saved to https://phabricator.wikimedia.org/P93895 and previous config saved to /var/cache/conftool/dbconfig/20260608-053156-marostegui.json
* 05:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2052.codfw.wmnet with OS trixie
* 05:18 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2052: Upgrading es2052.codfw.wmnet
* 05:18 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2052: Upgrading es2052.codfw.wmnet
* 05:18 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
== 2026-06-07 ==
* 16:32 elukey: `elukey@cumin1003:~$ sudo cumin 'cp6* and not cp6014* and not cp6010*' "varnish-frontend-restart" -b 1`
* 16:29 elukey: restart varnish-frontend on cp6014
== 2026-06-06 ==
* 09:07 ammarpad@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=hewiki --logwiki=metawiki W.Mechelke Tungsten_Mechelke # [[phab:T428182|T428182]]
== 2026-06-05 ==
* 22:16 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 21:01 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=10 --verbose` (after stopping the other commons scan)
* 20:56 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=30 --verbose` (after stopping the other commons scan)
* 20:20 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290093{{!}}Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)]] (duration: 10m 02s)
* 20:16 krinkle@deploy1003: krinkle: Continuing with deployment
* 20:12 krinkle@deploy1003: krinkle: Backport for [[gerrit:1290093{{!}}Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1290093{{!}}Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)]]
* 16:45 jgreen@dns1004: END - running authdns-update
* 16:44 jgreen@dns1004: START - running authdns-update
* 16:17 dzahn@dns1005: END - running authdns-update
* 16:17 mutante: DNS - adding new project language "mag" - Magahi - a language spoken in India and Nepal by about 12 million native speakers ([[phab:T428266|T428266]])
* 16:16 dzahn@dns1005: START - running authdns-update
* 14:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 12:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:30 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:30 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2202.codfw.wmnet with reason: Reboot
* 12:28 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:28 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:08 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:07 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:07 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:06 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:29 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:28 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:55 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:54 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:31 ozge@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1054: repool after upgrade
* 08:08 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 08:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 08:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 08:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:39 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1054: repool after upgrade
* 07:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:17 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:17 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:16 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:07 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 06:01 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1054.eqiad.wmnet with OS trixie
* 05:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1054.eqiad.wmnet with reason: host reimage
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1054.eqiad.wmnet with reason: host reimage
* 05:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1054.eqiad.wmnet with OS trixie
* 05:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1054: Upgrading es1054.eqiad.wmnet
* 05:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1054: Upgrading es1054.eqiad.wmnet
* 05:20 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 01:55 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1010.eqiad.wmnet with OS trixie
* 01:39 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
* 01:32 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
* 01:16 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS trixie
* 00:56 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1007.eqiad.wmnet with OS trixie
* 00:40 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
* 00:33 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
* 00:17 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1007.eqiad.wmnet with OS trixie
* 00:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297268{{!}}Redirect unknown wikinews languages to portal (T427126)]] (duration: 07m 02s)
== 2026-06-04 ==
* 23:57 ladsgroup@deploy1003: ladsgroup, pppery: Continuing with deployment
* 23:57 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1006.eqiad.wmnet with OS trixie
* 23:57 ladsgroup@deploy1003: ladsgroup, pppery: Backport for [[gerrit:1297268{{!}}Redirect unknown wikinews languages to portal (T427126)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:55 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1297268{{!}}Redirect unknown wikinews languages to portal (T427126)]]
* 23:40 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
* 23:36 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
* 23:20 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS trixie
* 21:28 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host releases1003.eqiad.wmnet with OS trixie
* 21:04 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: host reimage
* 20:58 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on releases1003.eqiad.wmnet with reason: host reimage
* 20:50 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.*
* 20:42 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host releases1003.eqiad.wmnet with OS trixie
* 20:27 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1100.eqiad.wmnet,service=(cdn{{!}}ats-be)
* 20:26 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6013.drmrs.wmnet,service=(cdn{{!}}ats-be)
* 20:20 brett@dns1006: END - running authdns-update
* 20:19 brett@dns1006: START - running authdns-update
* 20:18 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5030.eqsin.wmnet with OS trixie
* 20:10 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296015{{!}}Deploy PRV to 6 wikis (T427851)]] (duration: 07m 39s)
* 20:08 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist group2.dblist extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose`
* 20:06 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:04 arlolra@deploy1003: arlolra: Backport for [[gerrit:1296015{{!}}Deploy PRV to 6 wikis (T427851)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:02 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1296015{{!}}Deploy PRV to 6 wikis (T427851)]]
* 19:49 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:43 cmooney@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:15 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5030
* 19:15 cmooney@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5030
* 19:14 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cp5030
* 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5030.eqsin.wmnet 27.0.132.10.in-addr.arpa 7.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:14 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache cp5030.eqsin.wmnet 27.0.132.10.in-addr.arpa 7.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5030 - cmooney@cumin1003"
* 19:13 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5030 - cmooney@cumin1003"
* 19:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:08 cmooney@cumin1003: START - Cookbook sre.hosts.move-vlan for host cp5030
* 19:08 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host cp5030.eqsin.wmnet with OS trixie
* 18:51 cmooney@dns2005: END - running authdns-update
* 18:50 cmooney@dns2005: START - running authdns-update
* 18:43 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:42 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for eqsin cr links - cmooney@cumin1003"
* 18:40 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for eqsin cr links - cmooney@cumin1003"
* 18:37 sukhe: sukhe@cp6013:~$ sudo traffic_server -C clear_cache
* 18:36 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:08 dancy@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.5 refs [[phab:T423914|T423914]]
* 17:17 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297751{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297752{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] (duration: 06m 40s)
* 17:13 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:13 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297751{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297752{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297751{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297752{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]]
* 16:55 topranks: shift traffic off cr1-esams et-1/0/1 link to asw1-by27-esams [[phab:T427056|T427056]]
* 16:45 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297741{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297742{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] (duration: 13m 58s)
* 16:41 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 16:33 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297741{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297742{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:31 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297741{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297742{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]]
* 16:17 ozge@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 16:03 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297740{{!}}hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)]] (duration: 10m 21s)
* 16:03 elukey: uploaded spicerack_12.7.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 15:59 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:55 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297740{{!}}hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:53 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297740{{!}}hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)]]
* 15:44 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5030.*
* 15:41 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2007.codfw.wmnet with OS trixie
* 15:39 ladsgroup@cumin1003: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
* 15:28 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 15:24 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297730{{!}}ptwiki: Disable Article Guidance experiment (T426871)]] (duration: 07m 26s)
* 15:24 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
* 15:20 sbisson@deploy1003: sbisson: Continuing with deployment
* 15:19 sbisson@deploy1003: sbisson: Backport for [[gerrit:1297730{{!}}ptwiki: Disable Article Guidance experiment (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:19 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
* 15:17 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1297730{{!}}ptwiki: Disable Article Guidance experiment (T426871)]]
* 15:13 ladsgroup@cumin1003: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
* 15:06 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297724{{!}}Revert "Start reading from new file tables on commons"]] (duration: 07m 00s)
* 15:05 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 15:02 zabe@deploy1003: zabe: Continuing with deployment
* 15:01 zabe@deploy1003: zabe: Backport for [[gerrit:1297724{{!}}Revert "Start reading from new file tables on commons"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1297724{{!}}Revert "Start reading from new file tables on commons"]]
* 14:57 zabe@deploy1003: Finished scap sync-world: [[phab:T416548|T416548]] (duration: 05m 10s)
* 14:56 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-main2007.codfw.wmnet with OS trixie
* 14:52 zabe@deploy1003: Started scap sync-world: [[phab:T416548|T416548]]
* 14:50 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:49 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:43 zabe@deploy1003: sync-world aborted: Backport for [[gerrit:1270513{{!}}Start reading from new file tables on commons (T416548)]] (duration: 03m 58s)
* 14:43 zabe@deploy1003: zabe: Continuing with deployment
* 14:41 zabe@deploy1003: zabe: Backport for [[gerrit:1270513{{!}}Start reading from new file tables on commons (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f1-codfw
* 14:40 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-f1-codfw
* 14:39 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1270513{{!}}Start reading from new file tables on commons (T416548)]]
* 14:36 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297711{{!}}hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)]] (duration: 08m 20s)
* 14:32 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:30 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297711{{!}}hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1057: repool after upgrade
* 14:28 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297711{{!}}hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)]]
* 14:20 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 14:16 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 14:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 14:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297704{{!}}Use the globalblock-local-status right over globalblock-whitelist (T277942)]], [[gerrit:1296620{{!}}core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)]] (duration: 06m 46s)
* 14:10 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 14:08 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:08 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297704{{!}}Use the globalblock-local-status right over globalblock-whitelist (T277942)]], [[gerrit:1296620{{!}}core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 14:06 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297704{{!}}Use the globalblock-local-status right over globalblock-whitelist (T277942)]], [[gerrit:1296620{{!}}core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)]]
* 14:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:06 tappof: bump space for prometheus k8s-aux in eqiad
* 14:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 14:05 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:04 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
* 13:56 _joe_: transferred requestctl api tokens for all ops to the db ([[phab:T428119|T428119]])
* 13:56 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2050 to es3 codfw primary [[phab:T428050|T428050]]', diff saved to https://phabricator.wikimedia.org/P93878 and previous config saved to /var/cache/conftool/dbconfig/20260604-135631-marostegui.json
* 13:56 Dreamy_Jazz: Afternoon UTC backport window done
* 13:54 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297700{{!}}Revert "hCaptcha: Provide always challenge sitekey for account creation"]] (duration: 13m 38s)
* 13:51 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 13:50 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 13:47 sukhe: sukhe@cp6011:~$ sudo -i varnish-frontend-restart
* 13:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1057: repool after upgrade
* 13:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:43 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297700{{!}}Revert "hCaptcha: Provide always challenge sitekey for account creation"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1057.eqiad.wmnet with OS trixie
* 13:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297700{{!}}Revert "hCaptcha: Provide always challenge sitekey for account creation"]]
* 13:38 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297692{{!}}hCaptcha: Provide always challenge sitekey for account creation (T421041)]] (duration: 05m 27s)
* 13:38 dreamyjazz@deploy1003: dreamyjazz: Rolling back deployment
* 13:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: down
* 13:35 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297692{{!}}hCaptcha: Provide always challenge sitekey for account creation (T421041)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297692{{!}}hCaptcha: Provide always challenge sitekey for account creation (T421041)]]
* 13:31 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295978{{!}}Update config for WikiProjects linking prototype (T427804)]] (duration: 17m 13s)
* 13:26 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Continuing with deployment
* 13:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1057.eqiad.wmnet with reason: host reimage
* 13:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1057.eqiad.wmnet with reason: host reimage
* 13:16 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Backport for [[gerrit:1295978{{!}}Update config for WikiProjects linking prototype (T427804)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1295978{{!}}Update config for WikiProjects linking prototype (T427804)]]
* 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1220: Migration of db1220.eqiad.wmnet completed
* 13:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: down
* 13:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1224', diff saved to https://phabricator.wikimedia.org/P93875 and previous config saved to /var/cache/conftool/dbconfig/20260604-131219-marostegui.json
* 13:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1057.eqiad.wmnet with OS trixie
* 13:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1057: Upgrading es1057.eqiad.wmnet
* 12:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1057: Upgrading es1057.eqiad.wmnet
* 12:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:56 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296557{{!}}wmf-config: Skip CAPTCHA for action=mcrundo (T427612)]] (duration: 08m 30s)
* 12:52 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Continuing with deployment
* 12:50 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Backport for [[gerrit:1296557{{!}}wmf-config: Skip CAPTCHA for action=mcrundo (T427612)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2050: repool after upgrade
* 12:48 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296557{{!}}wmf-config: Skip CAPTCHA for action=mcrundo (T427612)]]
* 12:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 12:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 12:28 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1220: Migration of db1220.eqiad.wmnet completed
* 12:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1220.eqiad.wmnet with OS trixie
* 12:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2050: repool after upgrade
* 12:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1220.eqiad.wmnet with reason: host reimage
* 11:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1220.eqiad.wmnet with reason: host reimage
* 11:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1220.eqiad.wmnet with OS trixie
* 11:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2050.codfw.wmnet with OS trixie
* 11:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1220: Upgrading db1220.eqiad.wmnet
* 11:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1220: Upgrading db1220.eqiad.wmnet
* 11:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1179: Migration of db1179.eqiad.wmnet completed
* 11:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2050.codfw.wmnet with reason: host reimage
* 11:16 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2050.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2050.codfw.wmnet with OS trixie
* 11:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2050: Upgrading es2050.codfw.wmnet
* 10:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2050: Upgrading es2050.codfw.wmnet
* 10:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2057: repool after upgrade
* 10:58 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:55 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 10:46 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1179: Migration of db1179.eqiad.wmnet completed
* 10:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1179.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1179.eqiad.wmnet with reason: host reimage
* 10:16 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
* 10:15 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
* 10:15 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
* 10:15 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/kartotherian: apply
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1179.eqiad.wmnet with reason: host reimage
* 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2057: repool after upgrade
* 10:13 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2057.codfw.wmnet with OS trixie
* 09:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1179.eqiad.wmnet with OS trixie
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1179: Upgrading db1179.eqiad.wmnet
* 09:58 jynus: redoing m2 backups after grant change [[phab:T411111|T411111]]
* 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1179: Upgrading db1179.eqiad.wmnet
* 09:56 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2057.codfw.wmnet with reason: host reimage
* 09:53 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2057.codfw.wmnet with reason: host reimage
* 09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Migration of db1224.eqiad.wmnet completed
* 09:38 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:36 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:35 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:33 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2057.codfw.wmnet with OS trixie
* 09:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2057: Upgrading es2057.codfw.wmnet
* 09:32 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2057: Upgrading es2057.codfw.wmnet
* 09:31 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:26 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=30 --sleep=60 --verbose`
* 09:25 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "group0.dblist + group1.dblist - mediamoderation-continuous-scan.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose`
* 08:54 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Introduce pluggable authentication - oblivian@cumin1003"
* 08:54 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Introduce pluggable authentication - oblivian@cumin1003
* 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Migration of db1224.eqiad.wmnet completed
* 08:53 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Introduce pluggable authentication - oblivian@cumin1003
* 08:53 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Introduce pluggable authentication - oblivian@cumin1003"
* 08:29 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:29 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 08:24 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:24 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 08:21 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1224.eqiad.wmnet with OS trixie
* 08:21 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 08:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
* 08:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2249.codfw.wmnet with reason: upgrade
* 08:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
* 07:53 marostegui: Install mariadb 10.11.17 on db2249 [[phab:T427345|T427345]]
* 07:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1224.eqiad.wmnet with OS trixie
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1224: Upgrading db1224.eqiad.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1224: Upgrading db1224.eqiad.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1255: Migration of db1255.eqiad.wmnet completed
* 07:34 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297536{{!}}hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200{{!}}hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173{{!}}hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]] (duration: 08m 56s)
* 07:29 kharlan@deploy1003: kharlan, harroyo-wmf: Continuing with deployment
* 07:27 kharlan@deploy1003: kharlan, harroyo-wmf: Backport for [[gerrit:1297536{{!}}hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200{{!}}hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173{{!}}hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwd
* 07:25 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297536{{!}}hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200{{!}}hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173{{!}}hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]]
* 07:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2191: Migration of db2191.codfw.wmnet completed
* 07:12 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297550{{!}}Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]] (duration: 06m 45s)
* 07:08 kharlan@deploy1003: kharlan: Continuing with deployment
* 07:08 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297550{{!}}Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:06 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297550{{!}}Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]]
* 07:04 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297260{{!}}EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)]] (duration: 399m 30s)
* 07:03 otto@deploy1003: otto: Rolling back deployment
* 06:53 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1255: Migration of db1255.eqiad.wmnet completed
* 06:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1255.eqiad.wmnet with OS trixie
* 06:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2191: Migration of db2191.codfw.wmnet completed
* 06:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1255.eqiad.wmnet with reason: host reimage
* 06:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2191.codfw.wmnet with OS trixie
* 06:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1255.eqiad.wmnet with reason: host reimage
* 06:16 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1255.eqiad.wmnet with OS trixie
* 06:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2191.codfw.wmnet with reason: host reimage
* 06:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1255: Upgrading db1255.eqiad.wmnet
* 06:12 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1255: Upgrading db1255.eqiad.wmnet
* 06:12 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2191.codfw.wmnet with reason: host reimage
* 06:04 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db1255 [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93836 and previous config saved to /var/cache/conftool/dbconfig/20260604-060428-cwilliams.json
* 06:03 cwilliams@dns1004: END - running authdns-update
* 06:02 cwilliams@dns1004: START - running authdns-update
* 05:54 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db1258 to x3 primary and set section read-write [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93835 and previous config saved to /var/cache/conftool/dbconfig/20260604-055429-cwilliams.json
* 05:53 cwilliams@cumin1003: dbctl commit (dc=all): 'Set x3 eqiad as read-only for maintenance - [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93834 and previous config saved to /var/cache/conftool/dbconfig/20260604-055346-cwilliams.json
* 05:53 cezmunsta: Starting x3 eqiad failover from db1255 to db1258 - [[phab:T427895|T427895]]
* 05:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2191.codfw.wmnet with OS trixie
* 05:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2191: Upgrading db2191.codfw.wmnet
* 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2191: Upgrading db2191.codfw.wmnet
* 05:50 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db1258 with weight 0 [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93833 and previous config saved to /var/cache/conftool/dbconfig/20260604-055021-cwilliams.json
* 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:50 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T427895|T427895]]
* 05:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 05:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2191 [[phab:T428120|T428120]]', diff saved to https://phabricator.wikimedia.org/P93832 and previous config saved to /var/cache/conftool/dbconfig/20260604-054614-marostegui.json
* 05:45 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2215 to x1 primary [[phab:T428120|T428120]]', diff saved to https://phabricator.wikimedia.org/P93831 and previous config saved to /var/cache/conftool/dbconfig/20260604-054528-marostegui.json
* 05:44 marostegui: Starting x1 codfw failover from db2191 to db2215 - [[phab:T428120|T428120]]
* 05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 16 hosts with reason: Primary switchover x1 [[phab:T428120|T428120]]
* 05:27 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2215 with weight 0 [[phab:T428120|T428120]]', diff saved to https://phabricator.wikimedia.org/P93830 and previous config saved to /var/cache/conftool/dbconfig/20260604-052722-marostegui.json
* 05:19 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 03:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93829 and previous config saved to /var/cache/conftool/dbconfig/20260604-034546-fceratto.json
* 03:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P93828 and previous config saved to /var/cache/conftool/dbconfig/20260604-033538-fceratto.json
* 03:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P93827 and previous config saved to /var/cache/conftool/dbconfig/20260604-032531-fceratto.json
* 03:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93826 and previous config saved to /var/cache/conftool/dbconfig/20260604-031523-fceratto.json
* 03:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1263 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93825 and previous config saved to /var/cache/conftool/dbconfig/20260604-030710-fceratto.json
* 03:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1263.eqiad.wmnet with reason: Maintenance
* 03:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93824 and previous config saved to /var/cache/conftool/dbconfig/20260604-030642-fceratto.json
* 02:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P93823 and previous config saved to /var/cache/conftool/dbconfig/20260604-025634-fceratto.json
* 02:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P93822 and previous config saved to /var/cache/conftool/dbconfig/20260604-024627-fceratto.json
* 02:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93821 and previous config saved to /var/cache/conftool/dbconfig/20260604-023619-fceratto.json
* 02:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1262 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93820 and previous config saved to /var/cache/conftool/dbconfig/20260604-022809-fceratto.json
* 02:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1262.eqiad.wmnet with reason: Maintenance
* 02:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93819 and previous config saved to /var/cache/conftool/dbconfig/20260604-022742-fceratto.json
* 02:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P93818 and previous config saved to /var/cache/conftool/dbconfig/20260604-021734-fceratto.json
* 02:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P93817 and previous config saved to /var/cache/conftool/dbconfig/20260604-020726-fceratto.json
* 01:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93816 and previous config saved to /var/cache/conftool/dbconfig/20260604-015718-fceratto.json
* 01:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1261 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93815 and previous config saved to /var/cache/conftool/dbconfig/20260604-014909-fceratto.json
* 01:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1261.eqiad.wmnet with reason: Maintenance
* 01:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93814 and previous config saved to /var/cache/conftool/dbconfig/20260604-014841-fceratto.json
* 01:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P93813 and previous config saved to /var/cache/conftool/dbconfig/20260604-013833-fceratto.json
* 01:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P93812 and previous config saved to /var/cache/conftool/dbconfig/20260604-012826-fceratto.json
* 01:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93811 and previous config saved to /var/cache/conftool/dbconfig/20260604-011818-fceratto.json
* 01:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1260 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93810 and previous config saved to /var/cache/conftool/dbconfig/20260604-011005-fceratto.json
* 01:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1260.eqiad.wmnet with reason: Maintenance
* 01:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93809 and previous config saved to /var/cache/conftool/dbconfig/20260604-010937-fceratto.json
* 00:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P93808 and previous config saved to /var/cache/conftool/dbconfig/20260604-005929-fceratto.json
* 00:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P93807 and previous config saved to /var/cache/conftool/dbconfig/20260604-004922-fceratto.json
* 00:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93806 and previous config saved to /var/cache/conftool/dbconfig/20260604-003914-fceratto.json
* 00:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1252 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93805 and previous config saved to /var/cache/conftool/dbconfig/20260604-002851-fceratto.json
* 00:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1252.eqiad.wmnet with reason: Maintenance
* 00:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93804 and previous config saved to /var/cache/conftool/dbconfig/20260604-002821-fceratto.json
* 00:26 otto@deploy1003: otto: Backport for [[gerrit:1297260{{!}}EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:24 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1297260{{!}}EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)]]
* 00:18 Amir1: mwscript-k8s --follow --dblist=all -- extensions/timeline/maintenance/DeleteOldTimelineFiles.php --date 20210101000000
* 00:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P93803 and previous config saved to /var/cache/conftool/dbconfig/20260604-001813-fceratto.json
* 00:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P93802 and previous config saved to /var/cache/conftool/dbconfig/20260604-000805-fceratto.json
== 2026-06-03 ==
* 23:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93801 and previous config saved to /var/cache/conftool/dbconfig/20260603-235758-fceratto.json
* 23:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93800 and previous config saved to /var/cache/conftool/dbconfig/20260603-234935-fceratto.json
* 23:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
* 23:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93799 and previous config saved to /var/cache/conftool/dbconfig/20260603-234907-fceratto.json
* 23:42 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296561{{!}}Add a maintenance script to delete old files]], [[gerrit:1296560{{!}}Add a maintenance script to delete old files]] (duration: 07m 09s)
* 23:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P93798 and previous config saved to /var/cache/conftool/dbconfig/20260603-233859-fceratto.json
* 23:37 ladsgroup@deploy1003: ladsgroup, reedy: Continuing with deployment
* 23:36 ladsgroup@deploy1003: ladsgroup, reedy: Backport for [[gerrit:1296561{{!}}Add a maintenance script to delete old files]], [[gerrit:1296560{{!}}Add a maintenance script to delete old files]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:34 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1296561{{!}}Add a maintenance script to delete old files]], [[gerrit:1296560{{!}}Add a maintenance script to delete old files]]
* 23:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P93797 and previous config saved to /var/cache/conftool/dbconfig/20260603-232852-fceratto.json
* 23:22 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 23:22 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 23:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93796 and previous config saved to /var/cache/conftool/dbconfig/20260603-231844-fceratto.json
* 23:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93795 and previous config saved to /var/cache/conftool/dbconfig/20260603-231031-fceratto.json
* 23:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
* 23:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93794 and previous config saved to /var/cache/conftool/dbconfig/20260603-231001-fceratto.json
* 22:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P93793 and previous config saved to /var/cache/conftool/dbconfig/20260603-225953-fceratto.json
* 22:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P93792 and previous config saved to /var/cache/conftool/dbconfig/20260603-224945-fceratto.json
* 22:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93791 and previous config saved to /var/cache/conftool/dbconfig/20260603-223937-fceratto.json
* 22:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1244 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93790 and previous config saved to /var/cache/conftool/dbconfig/20260603-223116-fceratto.json
* 22:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1244.eqiad.wmnet with reason: Maintenance
* 22:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93789 and previous config saved to /var/cache/conftool/dbconfig/20260603-223048-fceratto.json
* 22:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P93788 and previous config saved to /var/cache/conftool/dbconfig/20260603-222041-fceratto.json
* 22:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P93787 and previous config saved to /var/cache/conftool/dbconfig/20260603-221034-fceratto.json
* 22:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93786 and previous config saved to /var/cache/conftool/dbconfig/20260603-220026-fceratto.json
* 21:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1243 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93785 and previous config saved to /var/cache/conftool/dbconfig/20260603-215110-fceratto.json
* 21:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93784 and previous config saved to /var/cache/conftool/dbconfig/20260603-215053-fceratto.json
* 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P93783 and previous config saved to /var/cache/conftool/dbconfig/20260603-214046-fceratto.json
* 21:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P93782 and previous config saved to /var/cache/conftool/dbconfig/20260603-213038-fceratto.json
* 21:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93781 and previous config saved to /var/cache/conftool/dbconfig/20260603-212030-fceratto.json
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1242 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93779 and previous config saved to /var/cache/conftool/dbconfig/20260603-211206-fceratto.json
* 21:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
* 21:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93778 and previous config saved to /var/cache/conftool/dbconfig/20260603-211138-fceratto.json
* 21:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P93774 and previous config saved to /var/cache/conftool/dbconfig/20260603-210130-fceratto.json
* 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P93773 and previous config saved to /var/cache/conftool/dbconfig/20260603-205122-fceratto.json
* 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93772 and previous config saved to /var/cache/conftool/dbconfig/20260603-204115-fceratto.json
* 20:33 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297228{{!}}Attribution research don't use testKitchen compatibility layer (T417050)]] (duration: 06m 41s)
* 20:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1241 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93771 and previous config saved to /var/cache/conftool/dbconfig/20260603-203254-fceratto.json
* 20:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
* 20:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93770 and previous config saved to /var/cache/conftool/dbconfig/20260603-203227-fceratto.json
* 20:29 cjming@deploy1003: cjming: Continuing with deployment
* 20:29 cjming@deploy1003: cjming: Backport for [[gerrit:1297228{{!}}Attribution research don't use testKitchen compatibility layer (T417050)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:26 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1297228{{!}}Attribution research don't use testKitchen compatibility layer (T417050)]]
* 20:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P93769 and previous config saved to /var/cache/conftool/dbconfig/20260603-202219-fceratto.json
* 20:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P93766 and previous config saved to /var/cache/conftool/dbconfig/20260603-201211-fceratto.json
* 20:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93765 and previous config saved to /var/cache/conftool/dbconfig/20260603-200203-fceratto.json
* 19:59 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
* 19:59 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
* 19:59 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 19:59 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 19:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93764 and previous config saved to /var/cache/conftool/dbconfig/20260603-195341-fceratto.json
* 19:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
* 19:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93763 and previous config saved to /var/cache/conftool/dbconfig/20260603-195313-fceratto.json
* 19:47 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 19:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P93762 and previous config saved to /var/cache/conftool/dbconfig/20260603-194306-fceratto.json
* 19:39 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 19:37 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 19:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P93761 and previous config saved to /var/cache/conftool/dbconfig/20260603-193258-fceratto.json
* 19:26 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
* 19:25 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
* 19:25 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 19:25 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 19:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93760 and previous config saved to /var/cache/conftool/dbconfig/20260603-192250-fceratto.json
* 19:22 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 19:22 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 19:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93759 and previous config saved to /var/cache/conftool/dbconfig/20260603-191437-fceratto.json
* 19:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1024-1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 19:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
* 19:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93758 and previous config saved to /var/cache/conftool/dbconfig/20260603-191348-fceratto.json
* 19:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P93757 and previous config saved to /var/cache/conftool/dbconfig/20260603-190340-fceratto.json
* 18:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P93756 and previous config saved to /var/cache/conftool/dbconfig/20260603-185331-fceratto.json
* 18:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93755 and previous config saved to /var/cache/conftool/dbconfig/20260603-184324-fceratto.json
* 18:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1199 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93754 and previous config saved to /var/cache/conftool/dbconfig/20260603-183455-fceratto.json
* 18:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
* 18:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93753 and previous config saved to /var/cache/conftool/dbconfig/20260603-183427-fceratto.json
* 18:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P93752 and previous config saved to /var/cache/conftool/dbconfig/20260603-182420-fceratto.json
* 18:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P93751 and previous config saved to /var/cache/conftool/dbconfig/20260603-181412-fceratto.json
* 18:10 dancy@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.5 refs [[phab:T423914|T423914]]
* 18:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93750 and previous config saved to /var/cache/conftool/dbconfig/20260603-180404-fceratto.json
* 17:57 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 17:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93749 and previous config saved to /var/cache/conftool/dbconfig/20260603-175544-fceratto.json
* 17:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
* 17:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93748 and previous config saved to /var/cache/conftool/dbconfig/20260603-175342-fceratto.json
* 17:52 hashar: contint1003: sudo puppet agent --disable "Prevent Jenkins from coming back"
* 17:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P93747 and previous config saved to /var/cache/conftool/dbconfig/20260603-174334-fceratto.json
* 17:38 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:37 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:37 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:33 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P93746 and previous config saved to /var/cache/conftool/dbconfig/20260603-173327-fceratto.json
* 17:33 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:32 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 17:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93745 and previous config saved to /var/cache/conftool/dbconfig/20260603-172319-fceratto.json
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: Stopping before sync operations
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: Started scap sync-world: No-deploy scap run to verify scap config change
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1253 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93744 and previous config saved to /var/cache/conftool/dbconfig/20260603-171521-fceratto.json
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1253.eqiad.wmnet with reason: Maintenance
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93743 and previous config saved to /var/cache/conftool/dbconfig/20260603-171452-fceratto.json
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:12 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:10 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:10 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:10 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:09 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2012.wikimedia.org with OS trixie
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P93742 and previous config saved to /var/cache/conftool/dbconfig/20260603-170444-fceratto.json
* 17:04 swfrench@deploy1003: Stopping before sync operations
* 17:03 swfrench@deploy1003: Started scap sync-world: No-deploy scap run to verify clean state before config change
* 16:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P93741 and previous config saved to /var/cache/conftool/dbconfig/20260603-165436-fceratto.json
* 16:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:53 hashar: Restarting CI Jenkins one last time # [[phab:T418521|T418521]]
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:44 btullis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295922{{!}}Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)]] (duration: 07m 16s)
* 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93740 and previous config saved to /var/cache/conftool/dbconfig/20260603-164428-fceratto.json
* 16:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:42 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:41 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:40 btullis@deploy1003: btullis: Continuing with deployment
* 16:39 btullis@deploy1003: btullis: Backport for [[gerrit:1295922{{!}}Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93739 and previous config saved to /var/cache/conftool/dbconfig/20260603-163726-fceratto.json
* 16:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance
* 16:37 btullis@deploy1003: Started scap sync-world: Backport for [[gerrit:1295922{{!}}Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)]]
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93738 and previous config saved to /var/cache/conftool/dbconfig/20260603-163658-fceratto.json
* 16:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P93737 and previous config saved to /var/cache/conftool/dbconfig/20260603-162650-fceratto.json
* 16:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P93736 and previous config saved to /var/cache/conftool/dbconfig/20260603-161643-fceratto.json
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93735 and previous config saved to /var/cache/conftool/dbconfig/20260603-160635-fceratto.json
* 16:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93734 and previous config saved to /var/cache/conftool/dbconfig/20260603-155928-fceratto.json
* 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93733 and previous config saved to /var/cache/conftool/dbconfig/20260603-155859-fceratto.json
* 15:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P93732 and previous config saved to /var/cache/conftool/dbconfig/20260603-154852-fceratto.json
* 15:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:46 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2012.wikimedia.org with OS trixie
* 15:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:40 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
* 15:40 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
* 15:40 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 15:39 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 15:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P93731 and previous config saved to /var/cache/conftool/dbconfig/20260603-153844-fceratto.json
* 15:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93729 and previous config saved to /var/cache/conftool/dbconfig/20260603-152836-fceratto.json
* 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
* 15:25 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
* 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
* 15:25 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
* 15:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:23 mutante: disabling jenkins on CI servers for maintenance
* 15:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
* 15:23 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1202 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93728 and previous config saved to /var/cache/conftool/dbconfig/20260603-152129-fceratto.json
* 15:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 15:21 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2012 to codfw - jhancock@cumin2002"
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93727 and previous config saved to /var/cache/conftool/dbconfig/20260603-152102-fceratto.json
* 15:20 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2012 to codfw - jhancock@cumin2002"
* 15:18 brouberol@dns1004: END - running authdns-update
* 15:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:16 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:16 brouberol@dns1004: START - running authdns-update
* 15:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P93726 and previous config saved to /var/cache/conftool/dbconfig/20260603-151055-fceratto.json
* 15:01 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P93725 and previous config saved to /var/cache/conftool/dbconfig/20260603-150047-fceratto.json
* 14:57 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 cmooney@cumin1003: END (FAIL) - Cookbook sre.netbox.update-extras (exit_code=1) rolling restart_daemons on A:netbox
* 14:51 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93723 and previous config saved to /var/cache/conftool/dbconfig/20260603-145039-fceratto.json
* 14:48 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297137{{!}}Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"]] (duration: 06m 46s)
* 14:47 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 14:46 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:46 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:43 mlitn@deploy1003: mlitn: Continuing with deployment
* 14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93722 and previous config saved to /var/cache/conftool/dbconfig/20260603-144334-fceratto.json
* 14:43 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 14:43 mlitn@deploy1003: mlitn: Backport for [[gerrit:1297137{{!}}Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93721 and previous config saved to /var/cache/conftool/dbconfig/20260603-144306-fceratto.json
* 14:41 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:41 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:41 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1297137{{!}}Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"]]
* 14:39 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:39 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:39 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:39 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:38 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 14:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 14:34 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297130{{!}}editor: make redesigned anon warning the default experience (T424595)]] (duration: 10m 45s)
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P93719 and previous config saved to /var/cache/conftool/dbconfig/20260603-143259-fceratto.json
* 14:30 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:28 sgimeno@deploy1003: sgimeno: Continuing with deployment
* 14:25 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1297130{{!}}editor: make redesigned anon warning the default experience (T424595)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:24 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:24 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:23 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1297130{{!}}editor: make redesigned anon warning the default experience (T424595)]]
* 14:23 gengh@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P93717 and previous config saved to /var/cache/conftool/dbconfig/20260603-142251-fceratto.json
* 14:22 gengh@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:22 gengh@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:21 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:21 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:21 gengh@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:20 gengh@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:20 gengh@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:20 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:20 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:19 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:19 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:16 vriley@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:16 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:16 gengh@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:13 gengh@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:12 gengh@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93716 and previous config saved to /var/cache/conftool/dbconfig/20260603-141242-fceratto.json
* 14:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:11 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:11 gengh@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc2055.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:10 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mc2055.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:10 gengh@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:09 gengh@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:08 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:07 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:05 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296631{{!}}translate: adding separate read/write endpoints (T425377)]] (duration: 13m 06s)
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1191 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93715 and previous config saved to /var/cache/conftool/dbconfig/20260603-140537-fceratto.json
* 14:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93714 and previous config saved to /var/cache/conftool/dbconfig/20260603-140507-fceratto.json
* 14:01 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:58 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 13:56 dcausse@deploy1003: atsuko, dcausse: Rolling back deployment
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T426633|T426633]])', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-133440-fceratto.json
* 13:29 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2186: Migration of db2186.codfw.wmnet completed
* 13:28 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295910{{!}}hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)]] (duration: 07m 36s)
* 13:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T426633|T426633]])', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-132638-fceratto.json
* 13:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 13:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93710 and previous config saved to /var/cache/conftool/dbconfig/20260603-132605-fceratto.json
* 13:25 sukhe: sudo cumin 'A:lvs or A:liberica' 'disable-puppet "merging CR 1282764"'
* 13:23 kharlan@deploy1003: kharlan: Continuing with deployment
* 13:22 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295910{{!}}hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295910{{!}}hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)]]
* 13:18 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296649{{!}}hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)]] (duration: 07m 46s)
* 13:16 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-131556-fceratto.json
* 13:15 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:13 kharlan@deploy1003: dbrant, kharlan: Continuing with deployment
* 13:12 kharlan@deploy1003: dbrant, kharlan: Backport for [[gerrit:1296649{{!}}hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296649{{!}}hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)]]
* 13:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add codfw d3 and e5 public vlans - ayounsi@cumin1003"
* 13:09 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add codfw d3 and e5 public vlans - ayounsi@cumin1003"
* 13:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P93708 and previous config saved to /var/cache/conftool/dbconfig/20260603-130548-fceratto.json
* 13:05 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93706 and previous config saved to /var/cache/conftool/dbconfig/20260603-125540-fceratto.json
* 12:51 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297110{{!}}ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)]] (duration: 07m 44s)
* 12:49 jgreen@dns1004: END - running authdns-update
* 12:47 jgreen@dns1004: START - running authdns-update
* 12:46 jiji@deploy1003: jiji: Continuing with deployment
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93705 and previous config saved to /var/cache/conftool/dbconfig/20260603-124624-fceratto.json
* 12:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93704 and previous config saved to /var/cache/conftool/dbconfig/20260603-124556-fceratto.json
* 12:45 jiji@deploy1003: jiji: Backport for [[gerrit:1297110{{!}}ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:43 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2186: Migration of db2186.codfw.wmnet completed
* 12:43 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1297110{{!}}ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)]]
* 12:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1067.eqiad.wmnet with OS bullseye
* 12:38 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1292364{{!}}Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)]] (duration: 11m 15s)
* 12:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2186.codfw.wmnet with OS trixie
* 12:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P93702 and previous config saved to /var/cache/conftool/dbconfig/20260603-123548-fceratto.json
* 12:34 dreamyjazz@deploy1003: somerandomdeveloper, dreamyjazz: Continuing with deployment
* 12:31 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1066.eqiad.wmnet with OS bullseye
* 12:29 dreamyjazz@deploy1003: somerandomdeveloper, dreamyjazz: Backport for [[gerrit:1292364{{!}}Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:27 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1292364{{!}}Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)]]
* 12:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P93701 and previous config saved to /var/cache/conftool/dbconfig/20260603-122541-fceratto.json
* 12:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1067.eqiad.wmnet with reason: host reimage
* 12:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2186.codfw.wmnet with reason: host reimage
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93700 and previous config saved to /var/cache/conftool/dbconfig/20260603-121533-fceratto.json
* 12:13 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ms-be1066.eqiad.wmnet with reason: host reimage
* 12:13 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2186.codfw.wmnet with reason: host reimage
* 12:11 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1067.eqiad.wmnet with reason: host reimage
* 12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93699 and previous config saved to /var/cache/conftool/dbconfig/20260603-120732-fceratto.json
* 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 12:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93698 and previous config saved to /var/cache/conftool/dbconfig/20260603-120634-fceratto.json
* 12:03 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1066.eqiad.wmnet with reason: host reimage
* 11:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P93697 and previous config saved to /var/cache/conftool/dbconfig/20260603-115626-fceratto.json
* 11:54 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2186.codfw.wmnet with OS trixie
* 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1067
* 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1067
* 11:52 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1067
* 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1067.eqiad.wmnet 96.48.64.10.in-addr.arpa 6.9.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:52 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1067.eqiad.wmnet 96.48.64.10.in-addr.arpa 6.9.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1067 - mvernon@cumin2002"
* 11:52 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1067 - mvernon@cumin2002"
* 11:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2186: Upgrading db2186.codfw.wmnet
* 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2186: Upgrading db2186.codfw.wmnet
* 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:47 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:46 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1067
* 11:46 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1067.eqiad.wmnet with OS bullseye
* 11:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P93695 and previous config saved to /var/cache/conftool/dbconfig/20260603-114618-fceratto.json
* 11:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1066
* 11:46 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1066
* 11:45 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1066
* 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1066.eqiad.wmnet 117.32.64.10.in-addr.arpa 7.1.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:45 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1066.eqiad.wmnet 117.32.64.10.in-addr.arpa 7.1.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1066 - mvernon@cumin2002"
* 11:45 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1066 - mvernon@cumin2002"
* 11:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 11:41 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:40 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1066
* 11:40 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1066.eqiad.wmnet with OS bullseye
* 11:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1067
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93693 and previous config saved to /var/cache/conftool/dbconfig/20260603-113611-fceratto.json
* 11:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Migration of db2196.codfw.wmnet completed
* 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93691 and previous config saved to /var/cache/conftool/dbconfig/20260603-112909-fceratto.json
* 11:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 6 hosts with reason: Maintenance
* 11:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
* 11:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93690 and previous config saved to /var/cache/conftool/dbconfig/20260603-112838-fceratto.json
* 11:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P93689 and previous config saved to /var/cache/conftool/dbconfig/20260603-111831-fceratto.json
* 11:14 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:09 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 11:09 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 11:08 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P93687 and previous config saved to /var/cache/conftool/dbconfig/20260603-110823-fceratto.json
* 11:07 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1066
* 11:07 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 11:06 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
* 11:05 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
* 11:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289895{{!}}Update UserInfoCard to be enabled by default for certain user groups (T426021)]] (duration: 07m 37s)
* 11:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:59 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 10:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:59 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 10:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 10:58 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93685 and previous config saved to /var/cache/conftool/dbconfig/20260603-105815-fceratto.json
* 10:58 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:56 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 10:55 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1289895{{!}}Update UserInfoCard to be enabled by default for certain user groups (T426021)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:54 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 10:54 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
* 10:53 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
* 10:53 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1289895{{!}}Update UserInfoCard to be enabled by default for certain user groups (T426021)]]
* 10:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1198 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93684 and previous config saved to /var/cache/conftool/dbconfig/20260603-105006-fceratto.json
* 10:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 10:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93683 and previous config saved to /var/cache/conftool/dbconfig/20260603-104939-fceratto.json
* 10:45 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:45 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Migration of db2196.codfw.wmnet completed
* 10:44 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:41 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:40 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P93681 and previous config saved to /var/cache/conftool/dbconfig/20260603-103931-fceratto.json
* 10:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1053: repool after upgrade
* 10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2196.codfw.wmnet with OS trixie
* 10:36 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297090{{!}}hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)]] (duration: 12m 03s)
* 10:32 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 10:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P93679 and previous config saved to /var/cache/conftool/dbconfig/20260603-102924-fceratto.json
* 10:26 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297090{{!}}hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:24 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297090{{!}}hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)]]
* 10:22 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1067
* 10:21 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1066
* 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93677 and previous config saved to /var/cache/conftool/dbconfig/20260603-101916-fceratto.json
* 10:15 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2013.codfw.wmnet
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
* 10:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93676 and previous config saved to /var/cache/conftool/dbconfig/20260603-101105-fceratto.json
* 10:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 10:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93675 and previous config saved to /var/cache/conftool/dbconfig/20260603-101037-fceratto.json
* 10:10 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2013.codfw.wmnet
* 10:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P93673 and previous config saved to /var/cache/conftool/dbconfig/20260603-100029-fceratto.json
* 09:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2196.codfw.wmnet with OS trixie
* 09:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: Upgrading db2196.codfw.wmnet
* 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2196: Upgrading db2196.codfw.wmnet
* 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1053: repool after upgrade
* 09:52 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:52 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:51 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:51 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:51 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P93670 and previous config saved to /var/cache/conftool/dbconfig/20260603-095022-fceratto.json
* 09:49 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:49 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:48 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1053.eqiad.wmnet with OS trixie
* 09:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2013.codfw.wmnet
* 09:41 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on es1053.eqiad.wmnet with reason: host reimage
* 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1053.eqiad.wmnet with reason: host reimage
* 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93669 and previous config saved to /var/cache/conftool/dbconfig/20260603-094014-fceratto.json
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2215: Migration of db2215.codfw.wmnet completed
* 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2013.codfw.wmnet
* 09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93667 and previous config saved to /var/cache/conftool/dbconfig/20260603-093146-fceratto.json
* 09:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93666 and previous config saved to /var/cache/conftool/dbconfig/20260603-093119-fceratto.json
* 09:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1211: Migration of db1211.eqiad.wmnet completed
* 09:27 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297069{{!}}hCaptcha: Collect risk score for blocked account creations (T427784)]] (duration: 07m 26s)
* 09:25 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1053.eqiad.wmnet with OS trixie
* 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add public1-b3-codfw gateway IPs - ayounsi@cumin1003"
* 09:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add public1-b3-codfw gateway IPs - ayounsi@cumin1003"
* 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1053: Upgrading es1053.eqiad.wmnet
* 09:23 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1053: Upgrading es1053.eqiad.wmnet
* 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:21 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297069{{!}}hCaptcha: Collect risk score for blocked account creations (T427784)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:21 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 09:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2054: repool after upgrade
* 09:21 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
* 09:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P93661 and previous config saved to /var/cache/conftool/dbconfig/20260603-092111-fceratto.json
* 09:20 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:20 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297069{{!}}hCaptcha: Collect risk score for blocked account creations (T427784)]]
* 09:14 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297065{{!}}Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"]] (duration: 07m 06s)
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P93659 and previous config saved to /var/cache/conftool/dbconfig/20260603-091104-fceratto.json
* 09:10 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:09 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297065{{!}}Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:07 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297065{{!}}Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"]]
* 09:06 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 09:06 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297064{{!}}Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] (duration: 10m 54s)
* 09:05 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 09:04 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 09:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003 - [[phab:T422043|T422043]]"
* 09:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93656 and previous config saved to /var/cache/conftool/dbconfig/20260603-090056-fceratto.json
* 09:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003 - [[phab:T422043|T422043]]"
* 09:00 ayounsi@cumin1003: END (ERROR) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=97) generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003"
* 09:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003"
* 08:59 kharlan@deploy1003: kharlan: Continuing with deployment
* 08:59 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297064{{!}}Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:55 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297064{{!}}Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]]
* 08:53 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296635{{!}}Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] (duration: 11m 43s)
* 08:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2215: Migration of db2215.codfw.wmnet completed
* 08:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet
* 08:52 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet
* 08:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb[1022-1023].eqiad.wmnet
* 08:51 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb[1022-1023].eqiad.wmnet
* 08:50 kharlan@deploy1003: kharlan: Rolling back deployment
* 08:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93652 and previous config saved to /var/cache/conftool/dbconfig/20260603-084846-fceratto.json
* 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 08:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93651 and previous config saved to /var/cache/conftool/dbconfig/20260603-084819-fceratto.json
* 08:47 kharlan@deploy1003: kharlan: Backport for [[gerrit:1296635{{!}}Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2215.codfw.wmnet with OS trixie
* 08:45 jiji@cumin1003: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) check docker-registry: maintenance
* 08:45 jiji@cumin1003: START - Cookbook sre.discovery.service-route check docker-registry: maintenance
* 08:43 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1211: Migration of db1211.eqiad.wmnet completed
* 08:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296635{{!}}Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]]
* 08:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1211.eqiad.wmnet with OS trixie
* 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93649 and previous config saved to /var/cache/conftool/dbconfig/20260603-083811-fceratto.json
* 08:37 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296632{{!}}Image Browsing: add accessible labels to carousel elements (T407793)]] (duration: 32m 11s)
* 08:36 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2054: repool after upgrade
* 08:35 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool es2054.codfw.wmnet: After reimage
* 08:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2054.codfw.wmnet: After reimage
* 08:35 jiji@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:34 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 08:34 jiji@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:33 jiji@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:33 jiji@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:31 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:31 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2054.codfw.wmnet with OS trixie
* 08:30 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:29 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2215.codfw.wmnet with reason: host reimage
* 08:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93647 and previous config saved to /var/cache/conftool/dbconfig/20260603-082804-fceratto.json
* 08:25 mszwarc@deploy1003: mlitn, mszwarc: Continuing with deployment
* 08:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1049: repool after upgrade
* 08:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2215.codfw.wmnet with reason: host reimage
* 08:22 mszwarc@deploy1003: mlitn, mszwarc: Backport for [[gerrit:1296632{{!}}Image Browsing: add accessible labels to carousel elements (T407793)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
* 08:18 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 08:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93645 and previous config saved to /var/cache/conftool/dbconfig/20260603-081756-fceratto.json
* 08:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 08:17 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 08:16 jiji@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2054.codfw.wmnet with reason: host reimage
* 08:08 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2054.codfw.wmnet with reason: host reimage
* 08:05 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1296632{{!}}Image Browsing: add accessible labels to carousel elements (T407793)]]
* {{safesubst:SAL entry|1=08:04 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296580{{!}}Add kha to wmgExtraLanguageNames (T427917)]], [[gerrit:1296703{{!}}jawiki: lift IP caps for workshop (T427912)]], [[gerrit:1296713{{!}}conductwiki: add sitename and logo (T426984 T427541)]], [[gerrit:1296627{{!}}Add missing lazy img to carousel (T427821)]], [[gerrit:1295968{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T426799)]]}}
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93643 and previous config saved to /var/cache/conftool/dbconfig/20260603-080346-fceratto.json
* 08:03 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1211.eqiad.wmnet with OS trixie
* 08:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 08:03 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2215.codfw.wmnet with OS trixie
* 08:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1211: Upgrading db1211.eqiad.wmnet
* 08:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2215: Upgrading db2215.codfw.wmnet
* 08:01 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:01 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1211: Upgrading db1211.eqiad.wmnet
* 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2215: Upgrading db2215.codfw.wmnet
* 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:01 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1157: Repooling
* 08:01 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1157: Repooling
* 08:00 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:57 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1022-1023].eqiad.wmnet with reason: Reimaging upstream server
* 07:57 mszwarc@deploy1003: anzx, mlitn, mfossati, mszwarc: Continuing with deployment
* 07:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Reimaging upstream server
* {{safesubst:SAL entry|1=07:54 mszwarc@deploy1003: anzx, mlitn, mfossati, mszwarc: Backport for [[gerrit:1296580{{!}}Add kha to wmgExtraLanguageNames (T427917)]], [[gerrit:1296703{{!}}jawiki: lift IP caps for workshop (T427912)]], [[gerrit:1296713{{!}}conductwiki: add sitename and logo (T426984 T427541)]], [[gerrit:1296627{{!}}Add missing lazy img to carousel (T427821)]], [[gerrit:1295968{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T42...]]}}
* 07:52 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2231: repool after maintenance
* 07:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2054.codfw.wmnet with OS trixie
* 07:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2054: Upgrading es2054.codfw.wmnet
* 07:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2054: Upgrading es2054.codfw.wmnet
* 07:50 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1296580{{!}}Add kha to wmgExtraLanguageNames (T427917)]], [[gerrit:1296703{{!}}jawiki: lift IP caps for workshop (T427912)]], [[gerrit:1296713{{!}}conductwiki: add sitename and logo (T426984 T427541)]], [[gerrit:1296627{{!}}Add missing lazy img to carousel (T427821)]], [[gerrit:1295968{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T426799)]]
* 07:48 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296516{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]], [[gerrit:1296517{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]] (duration: 32m 13s)
* 07:44 marostegui@dns1004: END - running authdns-update
* 07:43 marostegui@dns1004: START - running authdns-update
* 07:42 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1056 to es2 eqiad primary [[phab:T427875|T427875]]', diff saved to https://phabricator.wikimedia.org/P93637 and previous config saved to /var/cache/conftool/dbconfig/20260603-074250-marostegui.json
* 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1049: repool after upgrade
* 07:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:35 mszwarc@deploy1003: mszwarc, stran: Continuing with deployment
* 07:35 mszwarc@deploy1003: mszwarc, stran: Backport for [[gerrit:1296516{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]], [[gerrit:1296517{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1049.eqiad.wmnet with OS trixie
* 07:16 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1296516{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]], [[gerrit:1296517{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]]
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1049.eqiad.wmnet with reason: host reimage
* 07:07 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1049.eqiad.wmnet with reason: host reimage
* 07:07 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2231: repool after maintenance
* 07:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2231.codfw.wmnet with OS trixie
* 06:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1049.eqiad.wmnet with OS trixie
* 06:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1049: Upgrading es1049.eqiad.wmnet
* 06:46 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2056 to es2 codfw primary [[phab:T427875|T427875]]', diff saved to https://phabricator.wikimedia.org/P93632 and previous config saved to /var/cache/conftool/dbconfig/20260603-064623-marostegui.json
* 06:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1049: Upgrading es1049.eqiad.wmnet
* 06:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1056: repool after upgrade
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2231.codfw.wmnet with reason: host reimage
* 06:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2231.codfw.wmnet with reason: host reimage
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2231.codfw.wmnet with OS trixie
* 06:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2231: Upgrading db2231.codfw.wmnet
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2231: Upgrading db2231.codfw.wmnet
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:59 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1056: repool after upgrade
* 05:59 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 05:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1056.eqiad.wmnet with OS trixie
* 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1056.eqiad.wmnet with reason: host reimage
* 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1056.eqiad.wmnet with reason: host reimage
* 05:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1056.eqiad.wmnet with OS trixie
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1056: Upgrading es1056.eqiad.wmnet
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1056: Upgrading es1056.eqiad.wmnet
* 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
== 2026-06-02 ==
* 22:21 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296689{{!}}hCaptcha: Correct inaccurate comment]] (duration: 06m 27s)
* 22:18 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:18 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:17 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 22:17 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296689{{!}}hCaptcha: Correct inaccurate comment]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:15 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296689{{!}}hCaptcha: Correct inaccurate comment]]
* 22:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296551{{!}}hCaptcha: Enable for badlogin on group0 wikis (T426875)]] (duration: 08m 31s)
* 22:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:09 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 22:07 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296551{{!}}hCaptcha: Enable for badlogin on group0 wikis (T426875)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:05 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296551{{!}}hCaptcha: Enable for badlogin on group0 wikis (T426875)]]
* 20:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93621 and previous config saved to /var/cache/conftool/dbconfig/20260602-203945-fceratto.json
* 20:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93620 and previous config saved to /var/cache/conftool/dbconfig/20260602-202937-fceratto.json
* 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1054.eqiad.wmnet
* 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1054.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:26 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1054.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:20 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 20:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93619 and previous config saved to /var/cache/conftool/dbconfig/20260602-201929-fceratto.json
* 20:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93618 and previous config saved to /var/cache/conftool/dbconfig/20260602-200922-fceratto.json
* 20:03 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1054.eqiad.wmnet
* 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1053.eqiad.wmnet
* 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1053.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:37 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1053.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93617 and previous config saved to /var/cache/conftool/dbconfig/20260602-190907-fceratto.json
* 19:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 19:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93616 and previous config saved to /var/cache/conftool/dbconfig/20260602-190811-fceratto.json
* 19:05 dancy@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.5 refs [[phab:T423914|T423914]]
* 18:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P93615 and previous config saved to /var/cache/conftool/dbconfig/20260602-185804-fceratto.json
* 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P93614 and previous config saved to /var/cache/conftool/dbconfig/20260602-184757-fceratto.json
* 18:38 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:38 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:38 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93612 and previous config saved to /var/cache/conftool/dbconfig/20260602-183749-fceratto.json
* 18:37 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:37 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:33 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1053.eqiad.wmnet
* 18:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93611 and previous config saved to /var/cache/conftool/dbconfig/20260602-183023-fceratto.json
* 18:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93610 and previous config saved to /var/cache/conftool/dbconfig/20260602-182956-fceratto.json
* 18:27 mutante: gerrit delete unused plugin projects: barricade, WikimediaBlocks and WikimediaWebSessions
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1052.eqiad.wmnet
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1052.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1052.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:25 dancy: Train is blocked at testwikis on https://phabricator.wikimedia.org/T427935
* 18:21 Daimona: Running query from [[phab:T427962|T427962]]#11978299 in x1.wikishared
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P93609 and previous config saved to /var/cache/conftool/dbconfig/20260602-181949-fceratto.json
* 18:16 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296615{{!}}feat(cleanMentorList): Add a feature flag (T427386)]], [[gerrit:1296614{{!}}feat(cleanMentorList): Add a feature flag (T427386)]] (duration: 34m 09s)
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:12 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:12 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:12 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P93608 and previous config saved to /var/cache/conftool/dbconfig/20260602-180941-fceratto.json
* 18:08 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 18:07 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 18:06 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 18:06 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 18:04 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 18:02 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 18:02 swfrench-wmf: reverting shellbox to 2026-05-20-192555 due to errors in shellbox-syntaxhighlight
* 18:02 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 18:01 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 18:01 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 18:01 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1296615{{!}}feat(cleanMentorList): Add a feature flag (T427386)]], [[gerrit:1296614{{!}}feat(cleanMentorList): Add a feature flag (T427386)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:00 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1052.eqiad.wmnet
* 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93607 and previous config saved to /var/cache/conftool/dbconfig/20260602-175933-fceratto.json
* 17:58 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:57 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1051.eqiad.wmnet
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1051.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:55 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1051.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:53 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93605 and previous config saved to /var/cache/conftool/dbconfig/20260602-175227-fceratto.json
* 17:52 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93604 and previous config saved to /var/cache/conftool/dbconfig/20260602-175157-fceratto.json
* 17:51 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:51 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:50 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:50 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:50 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:49 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:49 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:48 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:48 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:47 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:44 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:42 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:42 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:42 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P93603 and previous config saved to /var/cache/conftool/dbconfig/20260602-174150-fceratto.json
* 17:41 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1296615{{!}}feat(cleanMentorList): Add a feature flag (T427386)]], [[gerrit:1296614{{!}}feat(cleanMentorList): Add a feature flag (T427386)]]
* 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P93602 and previous config saved to /var/cache/conftool/dbconfig/20260602-173143-fceratto.json
* 17:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93601 and previous config saved to /var/cache/conftool/dbconfig/20260602-172135-fceratto.json
* 17:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93600 and previous config saved to /var/cache/conftool/dbconfig/20260602-171422-fceratto.json
* 17:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93599 and previous config saved to /var/cache/conftool/dbconfig/20260602-171354-fceratto.json
* 17:04 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P93598 and previous config saved to /var/cache/conftool/dbconfig/20260602-170344-fceratto.json
* 16:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P93597 and previous config saved to /var/cache/conftool/dbconfig/20260602-165336-fceratto.json
* 16:49 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1051.eqiad.wmnet
* 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1050.eqiad.wmnet
* 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1050.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:47 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1050.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93596 and previous config saved to /var/cache/conftool/dbconfig/20260602-164328-fceratto.json
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93595 and previous config saved to /var/cache/conftool/dbconfig/20260602-163622-fceratto.json
* 16:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 16:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93594 and previous config saved to /var/cache/conftool/dbconfig/20260602-163550-fceratto.json
* 16:34 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:34 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS trixie
* 16:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:27 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2006.codfw.wmnet with OS trixie
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P93593 and previous config saved to /var/cache/conftool/dbconfig/20260602-162542-fceratto.json
* 16:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P93591 and previous config saved to /var/cache/conftool/dbconfig/20260602-161534-fceratto.json
* 16:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 16:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS trixie
* 16:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296624{{!}}Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] (duration: 06m 40s)
* 16:09 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
* 16:05 kharlan@deploy1003: kharlan: Continuing with deployment
* 16:05 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 16:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93590 and previous config saved to /var/cache/conftool/dbconfig/20260602-160527-fceratto.json
* 16:05 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
* 16:05 kharlan@deploy1003: kharlan: Backport for [[gerrit:1296624{{!}}Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:03 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296624{{!}}Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]]
* 15:59 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295909{{!}}hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)]] (duration: 09m 48s)
* 15:59 kharlan@deploy1003: kharlan: Rolling back deployment
* 15:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93589 and previous config saved to /var/cache/conftool/dbconfig/20260602-155817-fceratto.json
* 15:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 15:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93588 and previous config saved to /var/cache/conftool/dbconfig/20260602-155749-fceratto.json
* 15:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 15:53 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS trixie
* 15:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS trixie
* 15:51 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295909{{!}}hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:50 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 15:49 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295909{{!}}hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)]]
* 15:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P93587 and previous config saved to /var/cache/conftool/dbconfig/20260602-154742-fceratto.json
* 15:47 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296558{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]], [[gerrit:1296568{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]] (duration: 07m 24s)
* 15:43 kharlan@deploy1003: kharlan: Continuing with deployment
* 15:42 kharlan@deploy1003: kharlan: Backport for [[gerrit:1296558{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]], [[gerrit:1296568{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:40 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296558{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]], [[gerrit:1296568{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]]
* 15:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P93586 and previous config saved to /var/cache/conftool/dbconfig/20260602-153734-fceratto.json
* 15:37 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS trixie
* 15:36 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS trixie
* 15:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 15:32 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:32 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:31 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 15:30 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:29 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93585 and previous config saved to /var/cache/conftool/dbconfig/20260602-152726-fceratto.json
* 15:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2158: Repooling
* {{safesubst:SAL entry|1=15:22 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295502{{!}}Revert "labswiki: Disallow account autocreation"]], [[gerrit:1283106{{!}}Remove unused 'writeapi' right]], [[gerrit:1296566{{!}}Clean up bot password configuration]], [[gerrit:1296563{{!}}Remove workaround for stuck session cookies on Wikitech (T389433)]], [[gerrit:1295574{{!}}cswiki: lift IP cap for workshop on 08-June-2026 (T427678)]], [[gerrit:1296582{{!}}U}}
* 15:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 15:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93583 and previous config saved to /var/cache/conftool/dbconfig/20260602-152026-fceratto.json
* 15:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93582 and previous config saved to /var/cache/conftool/dbconfig/20260602-151958-fceratto.json
* 15:19 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:19 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:18 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS trixie
* 15:18 dreamyjazz@deploy1003: matmarex, anzx, dreamyjazz: Continuing with deployment
* 15:18 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:15 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* {{safesubst:SAL entry|1=15:15 dreamyjazz@deploy1003: matmarex, anzx, dreamyjazz: Backport for [[gerrit:1295502{{!}}Revert "labswiki: Disallow account autocreation"]], [[gerrit:1283106{{!}}Remove unused 'writeapi' right]], [[gerrit:1296566{{!}}Clean up bot password configuration]], [[gerrit:1296563{{!}}Remove workaround for stuck session cookies on Wikitech (T389433)]], [[gerrit:1295574{{!}}cswiki: lift IP cap for workshop on 08-June-2026 (T427678)]], [[gerrit:1296582}}
* 15:14 jiji@cumin1003: START - Cookbook sre.dns.netbox
* {{safesubst:SAL entry|1=15:13 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1295502{{!}}Revert "labswiki: Disallow account autocreation"]], [[gerrit:1283106{{!}}Remove unused 'writeapi' right]], [[gerrit:1296566{{!}}Clean up bot password configuration]], [[gerrit:1296563{{!}}Remove workaround for stuck session cookies on Wikitech (T389433)]], [[gerrit:1295574{{!}}cswiki: lift IP cap for workshop on 08-June-2026 (T427678)]], [[gerrit:1296582{{!}}Us}}
* 15:12 jayme@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2006.codfw.wmnet with OS trixie
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS trixie
* 15:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P93580 and previous config saved to /var/cache/conftool/dbconfig/20260602-150951-fceratto.json
* 15:09 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296514{{!}}[Growth] Set wgGEMentorshipCleanupEnabled to false on all wikis (T427386)]] (duration: 06m 22s)
* 15:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Repooling after Icinga wait-for-green timeout
* 15:06 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1050.eqiad.wmnet
* 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1049.eqiad.wmnet
* 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1049.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:05 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1049.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:02 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1296514{{!}}[Growth] Set wgGEMentorshipCleanupEnabled to false on all wikis (T427386)]]
* 15:02 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS trixie
* 15:01 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P93578 and previous config saved to /var/cache/conftool/dbconfig/20260602-145943-fceratto.json
* 14:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 14:52 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:52 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:52 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1049.eqiad.wmnet
* 14:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS trixie
* 14:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:50 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93575 and previous config saved to /var/cache/conftool/dbconfig/20260602-144935-fceratto.json
* 14:42 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for pc2021.codfw.wmnet
* 14:42 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for pc2021.codfw.wmnet
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2250.codfw.wmnet
* 14:41 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2250.codfw.wmnet
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2158.codfw.wmnet
* 14:41 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2158.codfw.wmnet
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Repooling
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Repooling
* 14:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93573 and previous config saved to /var/cache/conftool/dbconfig/20260602-144110-fceratto.json
* 14:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2158: Repooling
* 14:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93571 and previous config saved to /var/cache/conftool/dbconfig/20260602-144043-fceratto.json
* 14:38 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:38 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:38 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1048.eqiad.wmnet
* 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1048.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 14:37 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS trixie
* 14:36 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS trixie
* 14:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P93569 and previous config saved to /var/cache/conftool/dbconfig/20260602-143035-fceratto.json
* 14:30 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 14:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1048.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 14:21 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Repooling after Icinga wait-for-green timeout
* 14:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P93566 and previous config saved to /var/cache/conftool/dbconfig/20260602-142027-fceratto.json
* 14:17 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS trixie
* 14:17 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 14:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1167.eqiad.wmnet
* 14:17 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1167.eqiad.wmnet
* 14:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS trixie
* 14:15 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS trixie
* 14:14 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93564 and previous config saved to /var/cache/conftool/dbconfig/20260602-141019-fceratto.json
* 14:09 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments userOptions.php --delete --nowarn growthexperiments-homepage-variant # [[phab:T417621|T417621]]
* 14:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1048.eqiad.wmnet
* 14:08 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments userOptions.php --delete growthexperiments-homepage-variant # [[phab:T417621|T417621]]
* 14:05 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93563 and previous config saved to /var/cache/conftool/dbconfig/20260602-140140-fceratto.json
* 14:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 14:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 14:01 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS trixie
* 14:00 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 14:00 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93562 and previous config saved to /var/cache/conftool/dbconfig/20260602-140022-fceratto.json
* 14:00 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS trixie
* 13:56 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1167.eqiad.wmnet with OS trixie
* 13:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P93561 and previous config saved to /var/cache/conftool/dbconfig/20260602-135015-fceratto.json
* 13:47 topranks: revert all config to normal on cr1-codfw and ssw1-a1-codfw
* 13:43 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS trixie
* 13:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 13:40 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS trixie
* 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P93560 and previous config saved to /var/cache/conftool/dbconfig/20260602-134007-fceratto.json
* 13:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
* 13:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1002.eqiad.wmnet with OS trixie
* 13:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1003.eqiad.wmnet with OS trixie
* 13:34 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:34 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:32 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 13:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93559 and previous config saved to /var/cache/conftool/dbconfig/20260602-132959-fceratto.json
* 13:27 slyngshede@dns1004: END - running authdns-update
* 13:25 slyngshede@dns1004: START - running authdns-update
* 13:24 topranks: increase OSPF cost on ssw1-a1-codfw et-0/0/4 towards lsw1-a5-codfw [[phab:T427301|T427301]]
* 13:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 13:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93558 and previous config saved to /var/cache/conftool/dbconfig/20260602-132314-fceratto.json
* 13:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
* 13:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93557 and previous config saved to /var/cache/conftool/dbconfig/20260602-132246-fceratto.json
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS trixie
* 13:19 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 13:19 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS trixie
* 13:18 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 13:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2049: repool after upgrade
* 13:17 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:16 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1167.eqiad.wmnet with OS trixie
* 13:15 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1167: Upgrading db1167.eqiad.wmnet
* 13:13 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1167: Upgrading db1167.eqiad.wmnet
* 13:13 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:12 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 13:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P93554 and previous config saved to /var/cache/conftool/dbconfig/20260602-131238-fceratto.json
* 13:12 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 13:12 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 13:11 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 13:07 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1003.eqiad.wmnet with OS trixie
* 13:07 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS trixie
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS trixie
* 13:04 jayme@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2006.codfw.wmnet with OS trixie
* 13:04 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:03 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1022-1023].eqiad.wmnet with reason: Reimaging upstream servers
* 13:03 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1001.eqiad.wmnet with OS trixie
* 13:03 topranks: increase OSPF cost on ssw1-a1-codfw et-0/0/2 towards lsw1-a3-codfw [[phab:T427301|T427301]]
* 13:03 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 13:02 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Reimaging upstream servers
* 13:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P93553 and previous config saved to /var/cache/conftool/dbconfig/20260602-130230-fceratto.json
* 12:59 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2161: Migration of db2161.codfw.wmnet completed
* 12:54 topranks: shutdown sub-interfaces on cr1-codfw et-1/1/5 for row A/B vlans [[phab:T427301|T427301]]
* 12:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93550 and previous config saved to /var/cache/conftool/dbconfig/20260602-125223-fceratto.json
* 12:50 topranks: enable bgp graceful-shutdown in overlay on ssw1-a1-codfw [[phab:T427301|T427301]]
* 12:49 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1061.eqiad.wmnet with OS trixie
* 12:48 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt
* 12:48 ayounsi@cumin1003: START - Cookbook sre.hosts.remove-downtime for lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt
* 12:47 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS trixie
* 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93548 and previous config saved to /var/cache/conftool/dbconfig/20260602-124541-fceratto.json
* 12:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1207.eqiad.wmnet with reason: Maintenance
* 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93547 and previous config saved to /var/cache/conftool/dbconfig/20260602-124512-fceratto.json
* 12:43 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1060.eqiad.wmnet with OS trixie
* 12:42 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:42 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 12:42 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 12:41 topranks: enable bgp graceful-shutdown in underlay on ssw1-a1-codfw [[phab:T427301|T427301]]
* 12:35 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 12:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P93545 and previous config saved to /var/cache/conftool/dbconfig/20260602-123505-fceratto.json
* 12:33 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 12:33 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 12:31 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2049: repool after upgrade
* 12:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:29 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS trixie
* 12:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2049.codfw.wmnet with OS trixie
* 12:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P93542 and previous config saved to /var/cache/conftool/dbconfig/20260602-122459-fceratto.json
* 12:24 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS trixie
* 12:21 XioNoX: reboot lsw1-a3-codfw for software upgrade - [[phab:T427301|T427301]]
* 12:20 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS trixie
* 12:20 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 12:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS trixie
* 12:17 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 12:16 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296532{{!}}hCaptcha: Deduplicate edit API detection code (T427887)]], [[gerrit:1296533{{!}}hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)]] (duration: 09m 02s)
* 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93539 and previous config saved to /var/cache/conftool/dbconfig/20260602-121451-fceratto.json
* 12:11 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2049.codfw.wmnet with reason: host reimage
* 12:11 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt with reason: Switch maintenance
* 12:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2161: Migration of db2161.codfw.wmnet completed
* 12:09 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Switch maintenance
* 12:09 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296532{{!}}hCaptcha: Deduplicate edit API detection code (T427887)]], [[gerrit:1296533{{!}}hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1200 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93537 and previous config saved to /var/cache/conftool/dbconfig/20260602-120755-fceratto.json
* 12:07 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
* 12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93536 and previous config saved to /var/cache/conftool/dbconfig/20260602-120728-fceratto.json
* 12:07 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 12:07 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296532{{!}}hCaptcha: Deduplicate edit API detection code (T427887)]], [[gerrit:1296533{{!}}hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)]]
* 12:05 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2049.codfw.wmnet with reason: host reimage
* 12:04 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 12:02 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 12:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2161.codfw.wmnet with OS trixie
* 12:00 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 11:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P93535 and previous config saved to /var/cache/conftool/dbconfig/20260602-115721-fceratto.json
* 11:55 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 11:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:55 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 11:53 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 11:53 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 11:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:50 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS trixie
* 11:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS trixie
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2049.codfw.wmnet with OS trixie
* 11:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2049: Upgrading es2049.codfw.wmnet
* 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2049: Upgrading es2049.codfw.wmnet
* 11:47 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:47 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS trixie
* 11:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2056: repool after upgrade
* 11:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P93532 and previous config saved to /var/cache/conftool/dbconfig/20260602-114713-fceratto.json
* 11:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS trixie
* 11:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2161.codfw.wmnet with reason: host reimage
* 11:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2161.codfw.wmnet with reason: host reimage
* 11:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93531 and previous config saved to /var/cache/conftool/dbconfig/20260602-113705-fceratto.json
* 11:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1185 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93529 and previous config saved to /var/cache/conftool/dbconfig/20260602-113019-fceratto.json
* 11:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
* 11:29 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 11:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1161: Repooling
* 11:26 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1161: Repooling
* 11:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS trixie
* 11:22 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 11:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2161: Upgrading db2161.codfw.wmnet
* 11:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2161: Upgrading db2161.codfw.wmnet
* 11:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 11:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P93527 and previous config saved to /var/cache/conftool/dbconfig/20260602-111954-fceratto.json
* 11:15 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2161 [[phab:T427892|T427892]]', diff saved to https://phabricator.wikimedia.org/P93525 and previous config saved to /var/cache/conftool/dbconfig/20260602-111511-cwilliams.json
* 11:12 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2165 to s8 primary [[phab:T427892|T427892]]', diff saved to https://phabricator.wikimedia.org/P93524 and previous config saved to /var/cache/conftool/dbconfig/20260602-111200-cwilliams.json
* 11:10 cezmunsta: Starting s8 codfw failover from db2161 to db2165 - [[phab:T427892|T427892]]
* 11:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P93523 and previous config saved to /var/cache/conftool/dbconfig/20260602-110947-fceratto.json
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS trixie
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS trixie
* 11:04 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2165 with weight 0 [[phab:T427892|T427892]]', diff saved to https://phabricator.wikimedia.org/P93522 and previous config saved to /var/cache/conftool/dbconfig/20260602-110420-cwilliams.json
* 11:03 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 [[phab:T427892|T427892]]
* 11:02 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2056: repool after upgrade
* 11:01 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93520 and previous config saved to /var/cache/conftool/dbconfig/20260602-105939-fceratto.json
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93519 and previous config saved to /var/cache/conftool/dbconfig/20260602-105239-fceratto.json
* 10:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 10:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93518 and previous config saved to /var/cache/conftool/dbconfig/20260602-105202-fceratto.json
* 10:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2056.codfw.wmnet with OS trixie
* 10:42 moritzm: installing busybox security updates
* 10:42 claime: Enabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P93517 and previous config saved to /var/cache/conftool/dbconfig/20260602-104154-fceratto.json
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P93516 and previous config saved to /var/cache/conftool/dbconfig/20260602-103146-fceratto.json
* 10:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2056.codfw.wmnet with reason: host reimage
* 10:27 claime: Disabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 10:25 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2056.codfw.wmnet with reason: host reimage
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93515 and previous config saved to /var/cache/conftool/dbconfig/20260602-102139-fceratto.json
* 10:09 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2056.codfw.wmnet with OS trixie
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2056: Upgrading es2056.codfw.wmnet
* 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2056: Upgrading es2056.codfw.wmnet
* 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 09:56 claime: Enabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 09:46 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on cumin2003.codfw.wmnet with reason: in setup
* 09:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1187: Pooling
* 09:37 claime: Running puppet on cp6010 and cp6011 - [[phab:T422937|T422937]]
* 09:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow2004.codfw.wmnet to plain
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93511 and previous config saved to /var/cache/conftool/dbconfig/20260602-093716-fceratto.json
* 09:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1159.eqiad.wmnet with reason: Maintenance
* 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow2004.codfw.wmnet to plain
* 09:34 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of rpki2003.codfw.wmnet to plain
* 09:34 claime: Disabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 09:34 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of rpki2003.codfw.wmnet to plain
* 09:32 moritzm: temporarily remove ganeti2045 from the codfw cluster [[phab:T427357|T427357]]
* 09:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1055.eqiad.wmnet with OS trixie
* 09:15 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1187: Pooling
* 09:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1187 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93508 and previous config saved to /var/cache/conftool/dbconfig/20260602-091126-fceratto.json
* 09:09 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1187 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93506 and previous config saved to /var/cache/conftool/dbconfig/20260602-090432-fceratto.json
* 09:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
* 08:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2250.codfw.wmnet with reason: rack A3 maintenance
* 08:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS trixie
* 08:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:53 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:52 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:51 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:50 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 08:41 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:37 urbanecm: Reset user email of Barras@votewiki to the one of Barras@SUL
* 08:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 08:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93505 and previous config saved to /var/cache/conftool/dbconfig/20260602-083033-fceratto.json
* 08:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:29 slyngs: IDP, new configuration in preparation for webauthn
* 08:20 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P93504 and previous config saved to /var/cache/conftool/dbconfig/20260602-082026-fceratto.json
* 08:19 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:18 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:18 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:17 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296488{{!}}Revert "translate: adding separate read/write endpoints" (T425377)]] (duration: 03m 33s)
* 08:16 atsuko@deploy1003: atsuko: Rolling back deployment
* 08:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2053: repool after upgrade
* 08:15 atsuko@deploy1003: atsuko: Backport for [[gerrit:1296488{{!}}Revert "translate: adding separate read/write endpoints" (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:13 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1296488{{!}}Revert "translate: adding separate read/write endpoints" (T425377)]]
* 08:11 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 marostegui: Install mariadb 10.11.17 on es2053 [[phab:T427345|T427345]]
* 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P93502 and previous config saved to /var/cache/conftool/dbconfig/20260602-081018-fceratto.json
* 08:09 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Depool for rack maintenance
* 08:03 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296262{{!}}translate: fixing missed variable in credentials formatting closure (T425377)]] (duration: 14m 47s)
* 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93499 and previous config saved to /var/cache/conftool/dbconfig/20260602-080011-fceratto.json
* 07:59 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 07:59 atsuko@deploy1003: atsuko: Rolling back deployment
* 07:58 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 07:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1181 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93498 and previous config saved to /var/cache/conftool/dbconfig/20260602-075759-fceratto.json
* 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 07:57 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
* 07:50 atsuko@deploy1003: atsuko: Backport for [[gerrit:1296262{{!}}translate: fixing missed variable in credentials formatting closure (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:49 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1296262{{!}}translate: fixing missed variable in credentials formatting closure (T425377)]]
* 07:48 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1181: Pooling
* 07:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1181: Pooling
* 07:44 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1181: Reboot
* 07:43 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1181: Reboot
* 07:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1181.eqiad.wmnet with reason: Reboot
* 07:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1181: Migration of db1181.eqiad.wmnet completed
* 07:40 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294949{{!}}translate: adding separate read/write endpoints (T425377)]] (duration: 21m 01s)
* 07:39 atsuko@deploy1003: atsuko: Rolling back deployment
* 07:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93490 and previous config saved to /var/cache/conftool/dbconfig/20260602-073904-fceratto.json
* 07:32 XioNoX: pfw1-eqiad# delete protocols bgp group Production family inet6 - [[phab:T423384|T423384]]
* 07:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2053: repool after upgrade
* 07:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2158.codfw.wmnet with reason: rack A3 maintenance
* 07:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93487 and previous config saved to /var/cache/conftool/dbconfig/20260602-072856-fceratto.json
* 07:28 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2158: rack A3 maintenance
* 07:28 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2158: rack A3 maintenance
* 07:27 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on pc2021.codfw.wmnet with reason: rack A3 maintenance
* 07:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: rack A3 maintenance
* 07:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 07:25 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 07:25 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: rack A3 maintenance
* 07:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Depool for rack maintenance
* 07:23 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2241.codfw.wmnet
* 07:23 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2241.codfw.wmnet
* 07:21 atsuko@deploy1003: atsuko: Backport for [[gerrit:1294949{{!}}translate: adding separate read/write endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2053.codfw.wmnet with OS trixie
* 07:19 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1294949{{!}}translate: adding separate read/write endpoints (T425377)]]
* 07:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2241.codfw.wmnet with reason: Depool for rack maintenance
* 07:14 marostegui: Install mariadb 10.11.17 on db2186 [[phab:T427345|T427345]]
* 07:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Depool for rack maintenance
* 07:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2186.codfw.wmnet with reason: upgrade
* 07:12 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Depool for rack maintenance
* 07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2053.codfw.wmnet with reason: host reimage
* 06:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2053.codfw.wmnet with reason: host reimage
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93478 and previous config saved to /var/cache/conftool/dbconfig/20260602-065533-fceratto.json
* 06:55 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1181: Migration of db1181.eqiad.wmnet completed
* 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 06:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1181.eqiad.wmnet with OS trixie
* 06:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2053.codfw.wmnet with OS trixie
* 06:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2053: Upgrading es2053.codfw.wmnet
* 06:41 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2053: Upgrading es2053.codfw.wmnet
* 06:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 06:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 06:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 06:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1052: repool after upgrade
* 06:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
* 06:24 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
* 06:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 06:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 06:16 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 06:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 06:08 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1181.eqiad.wmnet with OS trixie
* 06:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1181: Upgrading db1181.eqiad.wmnet
* 06:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1181: Upgrading db1181.eqiad.wmnet
* 06:04 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:02 marostegui@dns1004: END - running authdns-update
* 06:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1181 [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93473 and previous config saved to /var/cache/conftool/dbconfig/20260602-060157-marostegui.json
* 06:01 marostegui@dns1004: START - running authdns-update
* 06:00 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1236 to s7 primary and set section read-write [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93472 and previous config saved to /var/cache/conftool/dbconfig/20260602-060041-marostegui.json
* 06:00 marostegui@cumin1003: dbctl commit (dc=all): 'Set s7 eqiad as read-only for maintenance - [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93471 and previous config saved to /var/cache/conftool/dbconfig/20260602-060018-marostegui.json
* 06:00 marostegui: Starting s7 eqiad failover from db1181 to db1236 - [[phab:T426088|T426088]]
* 05:51 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1236 with weight 0 [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93470 and previous config saved to /var/cache/conftool/dbconfig/20260602-055153-marostegui.json
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s7 [[phab:T426088|T426088]]
* 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1052: repool after upgrade
* 05:50 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 05:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1052.eqiad.wmnet with OS trixie
* 05:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:29 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1052.eqiad.wmnet with reason: host reimage
* 05:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1052.eqiad.wmnet with reason: host reimage
* 05:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:07 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1052.eqiad.wmnet with OS trixie
* 05:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1052: Upgrading es1052.eqiad.wmnet
* 05:06 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1052: Upgrading es1052.eqiad.wmnet
* 05:05 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 04:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 04:49 ryankemper: [[phab:T425007|T425007]] (k8s) created 4 wdqs namespaces on `dse-k8s-codfw`'s `admin_ng` ns: `wdqs-[internal,external]` & `wdqs-[internal,external]-next`; certs issued
* 04:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 04:40 ryankemper@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 04:36 ryankemper@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 04:05 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.2 (duration: 05m 33s)
== 2026-06-01 ==
* 23:27 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295963{{!}}Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542)]], [[gerrit:1295962{{!}}Carousel: Defer to MobileFrontend lightbox on mobile (T427679)]] (duration: 07m 17s)
* 23:23 jdlrobson@deploy1003: mfossati, jdlrobson: Continuing with deployment
* 23:22 jdlrobson@deploy1003: mfossati, jdlrobson: Backport for [[gerrit:1295963{{!}}Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542)]], [[gerrit:1295962{{!}}Carousel: Defer to MobileFrontend lightbox on mobile (T427679)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:20 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1295963{{!}}Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542)]], [[gerrit:1295962{{!}}Carousel: Defer to MobileFrontend lightbox on mobile (T427679)]]
* 23:15 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296022{{!}}Donor Delight Badge: Add dependency on mw.user (T427850)]], [[gerrit:1296028{{!}}styles: Limit selector to badge client pref (T427407)]] (duration: 09m 33s)
* 23:11 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:07 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1296022{{!}}Donor Delight Badge: Add dependency on mw.user (T427850)]], [[gerrit:1296028{{!}}styles: Limit selector to badge client pref (T427407)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:06 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1296022{{!}}Donor Delight Badge: Add dependency on mw.user (T427850)]], [[gerrit:1296028{{!}}styles: Limit selector to badge client pref (T427407)]]
* 23:04 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6015.*
* 22:36 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296024{{!}}Add maintenance script to scrape SVG render files]] (duration: 06m 22s)
* 22:32 reedy@deploy1003: reedy: Continuing with deployment
* 22:31 reedy@deploy1003: reedy: Backport for [[gerrit:1296024{{!}}Add maintenance script to scrape SVG render files]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:30 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1296024{{!}}Add maintenance script to scrape SVG render files]]
* 22:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 22:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 22:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 21:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 21:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 21:51 sbassett: Deployed updated mitigation for [[phab:T326691|T326691]]
* 21:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 21:35 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 21:35 maryum: Deployed security fix for [[phab:T427611|T427611]]
* 21:35 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 21:33 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 21:32 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 21:27 maryum: Deployed security fix for [[phab:T427235|T427235]]
* 21:13 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296002{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565)]], [[gerrit:1296003{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T427565)]], [[gerrit:1296009{{!}}Redirect Special:AccountRecovery to the shared domain (T427692)]] (duration: 09m 20s)
* 21:09 catrope@deploy1003: catrope, arlolra: Continuing with deployment
* 21:09 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 21:09 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 21:08 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 21:07 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 21:07 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 21:06 catrope@deploy1003: catrope, arlolra: Backport for [[gerrit:1296002{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565)]], [[gerrit:1296003{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T427565)]], [[gerrit:1296009{{!}}Redirect Special:AccountRecovery to the shared domain (T427692)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:04 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1296002{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565)]], [[gerrit:1296003{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T427565)]], [[gerrit:1296009{{!}}Redirect Special:AccountRecovery to the shared domain (T427692)]]
* 20:53 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 20:37 ryankemper@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wdqs1015.eqiad.wmnet with reason: [[phab:T427852|T427852]] hw failure
* 20:26 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285412{{!}}Remove `wgTestKitchenExperimentStreamNames` (T422358)]], [[gerrit:1295531{{!}}Enable AbuseFilter block action on nlwiki (T427384)]] (duration: 07m 48s)
* 20:22 catrope@deploy1003: sfaci, xxblackburnxx, catrope: Continuing with deployment
* 20:20 catrope@deploy1003: sfaci, xxblackburnxx, catrope: Backport for [[gerrit:1285412{{!}}Remove `wgTestKitchenExperimentStreamNames` (T422358)]], [[gerrit:1295531{{!}}Enable AbuseFilter block action on nlwiki (T427384)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:18 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1285412{{!}}Remove `wgTestKitchenExperimentStreamNames` (T422358)]], [[gerrit:1295531{{!}}Enable AbuseFilter block action on nlwiki (T427384)]]
* 20:12 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295504{{!}}passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)]] (duration: 07m 37s)
* 20:08 catrope@deploy1003: catrope: Continuing with deployment
* 20:07 catrope@deploy1003: catrope: Backport for [[gerrit:1295504{{!}}passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:05 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1295504{{!}}passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)]]
* 19:48 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 19:47 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 19:47 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 19:46 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 19:46 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 19:45 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 19:01 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: sync
* 19:00 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: sync
* 18:24 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295950{{!}}mediawiki.user_change.dev0 - key by user.wiki_id (T426198)]] (duration: 06m 42s)
* 18:20 otto@deploy1003: otto: Continuing with deployment
* 18:19 otto@deploy1003: otto: Backport for [[gerrit:1295950{{!}}mediawiki.user_change.dev0 - key by user.wiki_id (T426198)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:17 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1295950{{!}}mediawiki.user_change.dev0 - key by user.wiki_id (T426198)]]
* 18:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 18:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 18:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd2001.codfw.wmnet to plain
* 18:02 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 18:02 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd2001.codfw.wmnet to plain
* 18:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2003.codfw.wmnet to plain
* 18:01 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 18:01 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2003.codfw.wmnet to plain
* 17:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 17:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 17:53 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS trixie
* 17:42 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295976{{!}}nlwiki: change to Wikipedia 25 logo (T424519)]] (duration: 07m 29s)
* 17:37 samtar@deploy1003: chlod, samtar: Continuing with deployment
* 17:36 samtar@deploy1003: chlod, samtar: Backport for [[gerrit:1295976{{!}}nlwiki: change to Wikipedia 25 logo (T424519)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:34 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1295976{{!}}nlwiki: change to Wikipedia 25 logo (T424519)]]
* 17:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1236: Update
* 17:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd2001.codfw.wmnet to drbd
* 17:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
* 17:04 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 17:04 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1180: Pooling
* 17:03 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 17:03 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
* 17:03 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 16:59 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd2001.codfw.wmnet to drbd
* 16:58 Amir1: drop flaggedrevs tables on wikinews wikis ([[phab:T423577|T423577]])
* 16:57 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 16:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93462 and previous config saved to /var/cache/conftool/dbconfig/20260601-165717-fceratto.json
* 16:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93460 and previous config saved to /var/cache/conftool/dbconfig/20260601-164709-fceratto.json
* 16:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
* 16:37 ryankemper@cumin2002: conftool action : set/pooled=no; selector: dc=eqiad,cluster=wdqs-main,service=wdqs-main,name=wdqs1015.eqiad.wmnet
* 16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93458 and previous config saved to /var/cache/conftool/dbconfig/20260601-163701-fceratto.json
* 16:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:35 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1236.eqiad.wmnet
* 16:35 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1236.eqiad.wmnet
* 16:35 ryankemper@cumin2002: conftool action : set/pooled=no; selector: dc=eqiad,cluster=wdqs,service=wdqs-main,name=wdqs1015.eqiad.wmnet
* 16:34 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
* 16:34 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1236: Update
* 16:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1236.eqiad.wmnet with reason: Kernel update [[phab:T426633|T426633]]
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:30 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1236.eqiad.wmnet
* 16:30 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1236.eqiad.wmnet
* 16:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
* 16:29 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1236: Update
* 16:29 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
* 16:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2003.codfw.wmnet to drbd
* 16:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93455 and previous config saved to /var/cache/conftool/dbconfig/20260601-162653-fceratto.json
* 16:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 16:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Migration of db1209.eqiad.wmnet completed
* 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1236.eqiad.wmnet with reason: Kernel update [[phab:T426633|T426633]]
* 16:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1236: Update
* 16:09 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1236: Update
* 16:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:06 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main'.
* 16:05 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2003.codfw.wmnet to drbd
* 16:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 16:03 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 16:02 moritzm: temporarily remove ganeti2027 from the codfw cluster [[phab:T427357|T427357]]
* 15:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:56 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.depool (exit_code=97) depool db1224: Pooling
* 15:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2005.codfw.wmnet with OS bullseye
* 15:53 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1224: Pooling
* 15:51 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 15:49 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
* 15:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
* 15:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
* 15:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2005.codfw.wmnet with reason: host reimage
* 15:40 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:40 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
* 15:40 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
* 15:40 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
* 15:40 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
* 15:40 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
* 15:39 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:39 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 15:39 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Migration of db1209.eqiad.wmnet completed
* 15:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 15:38 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:38 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
* 15:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2005.codfw.wmnet with reason: host reimage
* 15:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 15:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1209.eqiad.wmnet with OS trixie
* 15:28 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295802{{!}}hCaptcha: Raise SiteVerify error threshold to 100]] (duration: 06m 15s)
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93446 and previous config saved to /var/cache/conftool/dbconfig/20260601-152638-fceratto.json
* 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 15:26 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:25 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
* 15:25 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
* 15:25 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
* 15:25 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:24 kharlan@deploy1003: kharlan: Continuing with deployment
* 15:24 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295802{{!}}hCaptcha: Raise SiteVerify error threshold to 100]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:22 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host testvm2005.codfw.wmnet with OS bullseye
* 15:22 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295802{{!}}hCaptcha: Raise SiteVerify error threshold to 100]]
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:20 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295946{{!}}hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)]] (duration: 08m 24s)
* 15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:16 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
* 15:14 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1295946{{!}}hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1295946{{!}}hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)]]
* 15:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
* 15:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93445 and previous config saved to /var/cache/conftool/dbconfig/20260601-151024-fceratto.json
* 15:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:sessionstore
* 15:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93443 and previous config saved to /var/cache/conftool/dbconfig/20260601-150017-fceratto.json
* 14:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1209.eqiad.wmnet with OS trixie
* 14:52 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 14:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1209: Upgrading db1209.eqiad.wmnet
* 14:52 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 14:52 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1209: Upgrading db1209.eqiad.wmnet
* 14:52 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 14:51 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:51 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 14:50 atsuko@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 14:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93441 and previous config saved to /var/cache/conftool/dbconfig/20260601-145010-fceratto.json
* 14:49 atsuko@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 14:49 atsuko@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 14:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:42 atsuko@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 14:41 atsuko@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93440 and previous config saved to /var/cache/conftool/dbconfig/20260601-144002-fceratto.json
* 14:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:30 ladsgroup@deploy1003: Synchronized portals: Deploy portals ([[phab:T421797|T421797]]) (duration: 02m 43s)
* 14:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:27 ladsgroup@deploy1003: Synchronized portals/wikipedia.org/assets: Deploy portals ([[phab:T421797|T421797]]) (duration: 06m 10s)
* 14:25 sukhe@dns1004: END - running authdns-update
* 14:23 sukhe@dns1004: START - running authdns-update
* 14:22 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:11 Lucas_WMDE: UTC afternoon backport+config window done
* 14:10 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295918{{!}}Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)]] (duration: 11m 06s)
* 14:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:05 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Continuing with deployment
* 14:03 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Backport for [[gerrit:1295918{{!}}Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:01 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:sessionstore
* 13:58 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1295918{{!}}Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)]]
* 13:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1265.eqiad.wmnet with OS trixie
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93439 and previous config saved to /var/cache/conftool/dbconfig/20260601-133947-fceratto.json
* 13:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 13:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 13:35 atsukoito: restarted pybal.service on lvs2013
* 13:31 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 13:31 atsukoito: restarted pybal.service on lvs2014
* 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs-test2001.codfw.wmnet
* 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet
* 13:22 atsukoito: restarted pybal.service on lvs1019
* 13:22 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:21 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:20 atsukoito: restarted pybal.service on lvs1020
* 13:20 Msz2001: UTC afternoon backport+config window done
* 13:20 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295875{{!}}Add SetGlobalPreference maintenance script (T427476)]] (duration: 06m 22s)
* 13:19 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs-test2001.codfw.wmnet
* 13:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1265.eqiad.wmnet with OS trixie
* 13:18 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs-test1001.eqiad.wmnet
* 13:16 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 13:15 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1295875{{!}}Add SetGlobalPreference maintenance script (T427476)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 atsukoito: sudo cumin 'A:lvs-low-traffic-eqiad' 'systemctl restart pybal.service'
* 13:14 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1295875{{!}}Add SetGlobalPreference maintenance script (T427476)]]
* 13:12 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295536{{!}}swwiki: Enable the Visual Editor on the project namespace (T427117)]] (duration: 10m 06s)
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93438 and previous config saved to /var/cache/conftool/dbconfig/20260601-130949-fceratto.json
* 13:08 mszwarc@deploy1003: codenamenoreste, mszwarc: Continuing with deployment
* 13:07 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:06 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:05 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:04 mszwarc@deploy1003: codenamenoreste, mszwarc: Backport for [[gerrit:1295536{{!}}swwiki: Enable the Visual Editor on the project namespace (T427117)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:03 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:02 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1295536{{!}}swwiki: Enable the Visual Editor on the project namespace (T427117)]]
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93437 and previous config saved to /var/cache/conftool/dbconfig/20260601-125941-fceratto.json
* 12:56 dpogorzelski@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=inference,name=eqiad
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'readability' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 12:52 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:50 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93436 and previous config saved to /var/cache/conftool/dbconfig/20260601-124934-fceratto.json
* 12:48 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:41 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93435 and previous config saved to /var/cache/conftool/dbconfig/20260601-123926-fceratto.json
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:29 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:28 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:27 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to plain
* 12:26 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to plain
* 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
* 12:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to drbd
* 12:20 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:17 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:15 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:15 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:11 dpogorzelski@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=inference,name=eqiad
* 12:07 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to drbd
* 12:05 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
* 12:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:04 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2027.codfw.wmnet
* 12:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 11:59 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
* 11:59 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
* 11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93434 and previous config saved to /var/cache/conftool/dbconfig/20260601-113911-fceratto.json
* 11:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 11:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93433 and previous config saved to /var/cache/conftool/dbconfig/20260601-113843-fceratto.json
* 11:37 moritzm: installing Exim security updates
* 11:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93432 and previous config saved to /var/cache/conftool/dbconfig/20260601-112835-fceratto.json
* 11:25 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 11:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:22 moritzm: installing imagemagick security updates
* 11:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:22 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 11:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93430 and previous config saved to /var/cache/conftool/dbconfig/20260601-111827-fceratto.json
* 11:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:14 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 11:12 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 11:10 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93429 and previous config saved to /var/cache/conftool/dbconfig/20260601-110820-fceratto.json
* 11:04 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 11:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1055: repool after upgrade
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93427 and previous config saved to /var/cache/conftool/dbconfig/20260601-110121-fceratto.json
* 11:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 10:54 marostegui@dns1004: END - running authdns-update
* 10:52 marostegui@dns1004: START - running authdns-update
* 10:48 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1050 to es1 eqiad primary [[phab:T427032|T427032]]', diff saved to https://phabricator.wikimedia.org/P93425 and previous config saved to /var/cache/conftool/dbconfig/20260601-104837-marostegui.json
* 10:47 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2055 to es1 codfw primary [[phab:T427032|T427032]]', diff saved to https://phabricator.wikimedia.org/P93424 and previous config saved to /var/cache/conftool/dbconfig/20260601-104739-marostegui.json
* 10:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1177: Migration of db1177.eqiad.wmnet completed
* 10:40 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2003.codfw.wmnet
* 10:34 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2003.codfw.wmnet
* 10:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93421 and previous config saved to /var/cache/conftool/dbconfig/20260601-103316-fceratto.json
* 10:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93418 and previous config saved to /var/cache/conftool/dbconfig/20260601-102308-fceratto.json
* 10:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1055: repool after upgrade
* 10:15 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1055.eqiad.wmnet with OS trixie
* 10:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93415 and previous config saved to /var/cache/conftool/dbconfig/20260601-101300-fceratto.json
* 10:09 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:07 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93414 and previous config saved to /var/cache/conftool/dbconfig/20260601-100252-fceratto.json
* 10:00 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1177: Migration of db1177.eqiad.wmnet completed
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1055.eqiad.wmnet with reason: host reimage
* 09:56 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 09:54 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 09:53 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1055.eqiad.wmnet with reason: host reimage
* 09:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1177.eqiad.wmnet with OS trixie
* 09:51 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:50 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:39 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1055.eqiad.wmnet with OS trixie
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1055: Upgrading es1055.eqiad.wmnet
* 09:38 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1055: Upgrading es1055.eqiad.wmnet
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
* 09:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
* 09:17 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1177.eqiad.wmnet with OS trixie
* 09:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:14 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:13 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:12 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1177: Upgrading db1177.eqiad.wmnet
* 09:11 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1177: Upgrading db1177.eqiad.wmnet
* 09:11 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93410 and previous config saved to /var/cache/conftool/dbconfig/20260601-090237-fceratto.json
* 09:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93409 and previous config saved to /var/cache/conftool/dbconfig/20260601-090209-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P93408 and previous config saved to /var/cache/conftool/dbconfig/20260601-085202-fceratto.json
* 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P93407 and previous config saved to /var/cache/conftool/dbconfig/20260601-084154-fceratto.json
* 08:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93406 and previous config saved to /var/cache/conftool/dbconfig/20260601-083146-fceratto.json
* 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93405 and previous config saved to /var/cache/conftool/dbconfig/20260601-082442-fceratto.json
* 08:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 07:58 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295454{{!}}Disable the creation of synthetic main refs in production (T427484)]] (duration: 11m 26s)
* 07:56 XioNoX: add no_p2p term to pfw1-codfw BGP_fundraising_export - [[phab:T423384|T423384]]
* 07:52 wmde-fisch@deploy1003: lilients, wmde-fisch: Continuing with deployment
* 07:51 wmde-fisch@deploy1003: lilients, wmde-fisch: Backport for [[gerrit:1295454{{!}}Disable the creation of synthetic main refs in production (T427484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:47 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1295454{{!}}Disable the creation of synthetic main refs in production (T427484)]]
* 07:45 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294826{{!}}Update VE core submodule to master (9cf5524e7) (T424232)]] (duration: 31m 34s)
* 07:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 07:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 07:32 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
* 07:31 wmde-fisch@deploy1003: wmde-fisch: Backport for [[gerrit:1294826{{!}}Update VE core submodule to master (9cf5524e7) (T424232)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 07:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1294826{{!}}Update VE core submodule to master (9cf5524e7) (T424232)]]
* 06:48 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 06:47 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
== 2026-05-31 ==
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 30s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-30 ==
* 16:21 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:38 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 27s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-29 ==
* 23:39 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 23:37 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 21:42 catrope@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 21:41 catrope@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:40 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] (duration: 06m 54s)
* 17:35 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 17:34 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:33 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]]
* 16:30 jgreen@dns1004: END - running authdns-update
* 16:28 jgreen@dns1004: START - running authdns-update
* 16:13 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:12 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:28 dancy@deploy1003: Installation of scap version "4.267.0" completed for 2 hosts
* 15:26 dancy@deploy1003: Installing scap version "4.267.0" for 2 host(s)
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:15 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] (duration: 07m 58s)
* 14:11 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:09 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]]
* 13:53 moritzm: imported OpenJDK 21 21.0.11+10-1~deb12u1 to component/jdk21 (backport of latest Java 21 security release for Bookworm)
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1006.wikimedia.org
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1006.wikimedia.org with OS trixie
* 11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1006.wikimedia.org with OS trixie
* 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:15 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:13 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:00 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:00 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1006.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1005.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1005.wikimedia.org with OS trixie
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:40 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Pooling
* 10:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1005.wikimedia.org with OS trixie
* 10:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:55 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 09:50 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 09:49 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:44 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2014.codfw.wmnet with OS bookworm
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:20 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:12 jynus@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:10 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:03 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad2002.codfw.wmnet
* 08:59 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad2002.codfw.wmnet
* 08:59 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad2002.codfw.wmnet - [[phab:T427588|T427588]]
* 08:54 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:51 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad1004.eqiad.wmnet
* 08:50 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
* 08:50 jynus@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host backup2014.codfw.wmnet with OS bookworm
* 08:49 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
* 08:47 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad1004.eqiad.wmnet
* 08:46 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad1004.eqiad.wmnet - [[phab:T427588|T427588]]
* 08:42 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:39 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:39 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:38 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 08:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 08:33 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:31 jynus@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup2014.codfw.wmnet with OS bookworm
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:18 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 08:17 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 08:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1005.wikimedia.org
* 08:05 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 07:54 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 07:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2212.codfw.wmnet
* 07:54 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2212.codfw.wmnet
* 07:22 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2006.wikimedia.org
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2006.wikimedia.org with OS trixie
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2006.wikimedia.org with OS trixie
* 06:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2006.wikimedia.org
* 03:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts db1224.eqiad.wmnet
* 02:56 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 01:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5032.eqsin.wmnet with OS trixie
* 01:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 01:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 00:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 00:29 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host cp5032.eqsin.wmnet
* 00:23 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:22 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
== 2026-05-28 ==
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:07 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:34 andrewbogott: reprepro includedeb trixie-wikimedia /home/andrew/magnum-cluster-api_0.36.6-1~wmf13u2_amd64.deb
* 22:31 logmsgbot: dreamyjazz Deployed security patch for [[phab:T426388|T426388]]
* 21:33 maryum: Deployed security fix for [[phab:T426867|T426867]]
* 21:21 alexsanford: Deployed security fix for [[phab:T426889|T426889]]
* 21:07 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host cp5032.eqsin.wmnet
* 21:04 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 21:04 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 20:48 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] (duration: 07m 34s)
* 20:44 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:43 arlolra@deploy1003: arlolra: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:41 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]]
* 20:34 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] (duration: 07m 20s)
* 20:30 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:29 arlolra@deploy1003: arlolra: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:27 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]]
* 20:22 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] (duration: 09m 07s)
* 20:18 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Continuing with deployment
* 20:14 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] synced to the testservers (see https://wikitech.
* 20:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5032.eqsin.wmnet with OS trixie
* 20:13 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]]
* 19:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1018.eqiad.wmnet
* 19:27 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1018.eqiad.wmnet
* 19:09 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: Kernel reboot
* 19:09 brett: Stopping pybal/puppet/downtiming lvs1018.eqiad.wmnet for reboot
* 19:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1019.eqiad.wmnet
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1019.eqiad.wmnet
* 18:52 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:51 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:47 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:40 mutante: planet1003/planet2003 - apt-get upgrade - all pending package upgrades
* 18:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: Kernel reboot
* 18:34 brett: Stopping pybal/puppet/downtiming lvs1019.eqiad.wmnet for reboot and BIOS update/memory self-healing - [[phab:T426109|T426109]]
* 18:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2011.codfw.wmnet
* 18:25 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2011.codfw.wmnet
* 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Kernel reboot
* 18:19 brett: Stopping pybal/puppet/downtiming lvs2011.codfw.wmnet for reboot
* 18:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 18:06 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 18:00 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: Kernel reboot
* 17:57 brett: Stopping pybal/puppet/downtiming lvs2013.codfw.wmnet for reboot
* 17:19 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 16:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93393 and previous config saved to /var/cache/conftool/dbconfig/20260528-164514-fceratto.json
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93392 and previous config saved to /var/cache/conftool/dbconfig/20260528-163507-fceratto.json
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93391 and previous config saved to /var/cache/conftool/dbconfig/20260528-162459-fceratto.json
* 16:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db1224.eqiad.wmnet with reason: unreachable [[phab:T427535|T427535]]
* 16:17 swfrench-wmf: reprepro include xdebug_3.4.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include wikidiff2_1.14.1-2+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include php-yaml_2.2.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-xhprof_2.3.10-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-wmerrors_2.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-uuid_1.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-redis_6.2.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-pcov_1.0.12-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-memcached_3.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:15 swfrench-wmf: reprepro include php-luasandbox_4.1.2-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93390 and previous config saved to /var/cache/conftool/dbconfig/20260528-161452-fceratto.json
* 16:14 swfrench-wmf: reprepro include php-imagick_3.7.0-13+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:14 swfrench-wmf: reprepro include php-excimer_1.2.5-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1251 ([[phab:T426633|T426633]])', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260528-160646-fceratto.json
* 16:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1251.eqiad.wmnet with reason: Maintenance
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93388 and previous config saved to /var/cache/conftool/dbconfig/20260528-160613-fceratto.json
* 15:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93387 and previous config saved to /var/cache/conftool/dbconfig/20260528-155605-fceratto.json
* 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93386 and previous config saved to /var/cache/conftool/dbconfig/20260528-154557-fceratto.json
* 15:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93385 and previous config saved to /var/cache/conftool/dbconfig/20260528-153550-fceratto.json
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93384 and previous config saved to /var/cache/conftool/dbconfig/20260528-152736-fceratto.json
* 15:27 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: Maintenance
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93383 and previous config saved to /var/cache/conftool/dbconfig/20260528-152708-fceratto.json
* 15:20 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5032.eqsin.wmnet with reason: Testing reimaging on new subnet
* 15:18 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 15:17 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93382 and previous config saved to /var/cache/conftool/dbconfig/20260528-151701-fceratto.json
* 15:17 jhathaway: dmarc ingress test on mx-in1001
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93381 and previous config saved to /var/cache/conftool/dbconfig/20260528-150653-fceratto.json
* 14:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93380 and previous config saved to /var/cache/conftool/dbconfig/20260528-145646-fceratto.json
* 14:56 moritzm: installing nginx security updates
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93379 and previous config saved to /var/cache/conftool/dbconfig/20260528-144936-fceratto.json
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: Maintenance
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93378 and previous config saved to /var/cache/conftool/dbconfig/20260528-144909-fceratto.json
* 14:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2005.wikimedia.org
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2005.wikimedia.org with OS trixie
* 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:39 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93377 and previous config saved to /var/cache/conftool/dbconfig/20260528-143901-fceratto.json
* 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93376 and previous config saved to /var/cache/conftool/dbconfig/20260528-142854-fceratto.json
* 14:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] (duration: 11m 29s)
* 14:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93375 and previous config saved to /var/cache/conftool/dbconfig/20260528-141846-fceratto.json
* 14:15 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93374 and previous config saved to /var/cache/conftool/dbconfig/20260528-141029-fceratto.json
* 14:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1232.eqiad.wmnet with reason: Maintenance
* 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2005.wikimedia.org with OS trixie
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93373 and previous config saved to /var/cache/conftool/dbconfig/20260528-141001-fceratto.json
* 14:09 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]]
* 14:00 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93371 and previous config saved to /var/cache/conftool/dbconfig/20260528-135951-fceratto.json
* 13:58 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6015.drmrs.wmnet,service=(cdn{{!}}ats-be)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93370 and previous config saved to /var/cache/conftool/dbconfig/20260528-134944-fceratto.json
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93369 and previous config saved to /var/cache/conftool/dbconfig/20260528-133936-fceratto.json
* 13:39 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:38 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:36 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] (duration: 06m 40s)
* 13:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93368 and previous config saved to /var/cache/conftool/dbconfig/20260528-133230-fceratto.json
* 13:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93367 and previous config saved to /var/cache/conftool/dbconfig/20260528-133202-fceratto.json
* 13:31 mlitn@deploy1003: mlitn: Continuing with deployment
* 13:31 mlitn@deploy1003: mlitn: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:29 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]]
* 13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93366 and previous config saved to /var/cache/conftool/dbconfig/20260528-132155-fceratto.json
* 13:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:17 elukey: clean up a lot of stale Kafka ACLs on Kafka Jumbo - Details in [[phab:T425528|T425528]]
* 13:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2005.wikimedia.org
* 13:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93365 and previous config saved to /var/cache/conftool/dbconfig/20260528-131147-fceratto.json
* 13:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93364 and previous config saved to /var/cache/conftool/dbconfig/20260528-130139-fceratto.json
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93363 and previous config saved to /var/cache/conftool/dbconfig/20260528-125439-fceratto.json
* 12:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93362 and previous config saved to /var/cache/conftool/dbconfig/20260528-125412-fceratto.json
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93361 and previous config saved to /var/cache/conftool/dbconfig/20260528-124404-fceratto.json
* 12:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:43 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:39 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:38 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93360 and previous config saved to /var/cache/conftool/dbconfig/20260528-123357-fceratto.json
* 12:25 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93359 and previous config saved to /var/cache/conftool/dbconfig/20260528-122349-fceratto.json
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93358 and previous config saved to /var/cache/conftool/dbconfig/20260528-121551-fceratto.json
* 12:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance
* 12:15 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93357 and previous config saved to /var/cache/conftool/dbconfig/20260528-121523-fceratto.json
* 12:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93356 and previous config saved to /var/cache/conftool/dbconfig/20260528-120515-fceratto.json
* 12:02 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 12:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93355 and previous config saved to /var/cache/conftool/dbconfig/20260528-115508-fceratto.json
* 11:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93354 and previous config saved to /var/cache/conftool/dbconfig/20260528-114500-fceratto.json
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93353 and previous config saved to /var/cache/conftool/dbconfig/20260528-113635-fceratto.json
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93352 and previous config saved to /var/cache/conftool/dbconfig/20260528-113559-fceratto.json
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93351 and previous config saved to /var/cache/conftool/dbconfig/20260528-112551-fceratto.json
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93350 and previous config saved to /var/cache/conftool/dbconfig/20260528-111543-fceratto.json
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93349 and previous config saved to /var/cache/conftool/dbconfig/20260528-110536-fceratto.json
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93348 and previous config saved to /var/cache/conftool/dbconfig/20260528-105820-fceratto.json
* 10:58 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1195.eqiad.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93347 and previous config saved to /var/cache/conftool/dbconfig/20260528-105753-fceratto.json
* 10:56 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 10:50 moritzm: update trixie netboot image for 13.5 point release [[phab:T427072|T427072]]
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93346 and previous config saved to /var/cache/conftool/dbconfig/20260528-104745-fceratto.json
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93345 and previous config saved to /var/cache/conftool/dbconfig/20260528-103738-fceratto.json
* 10:29 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P13724 # [[phab:T406971|T406971]]
* 10:28 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P14223 # [[phab:T422264|T422264]]
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93344 and previous config saved to /var/cache/conftool/dbconfig/20260528-102730-fceratto.json
* 10:26 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P1748 # [[phab:T422392|T422392]]
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93343 and previous config saved to /var/cache/conftool/dbconfig/20260528-101900-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93342 and previous config saved to /var/cache/conftool/dbconfig/20260528-101829-fceratto.json
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93341 and previous config saved to /var/cache/conftool/dbconfig/20260528-100822-fceratto.json
* 09:59 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] (duration: 06m 41s)
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93340 and previous config saved to /var/cache/conftool/dbconfig/20260528-095814-fceratto.json
* 09:55 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 09:54 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:52 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]]
* 09:48 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] (duration: 07m 37s)
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93339 and previous config saved to /var/cache/conftool/dbconfig/20260528-094807-fceratto.json
* 09:44 dreamyjazz@deploy1003: dreamyjazz, stran: Continuing with deployment
* 09:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:42 dreamyjazz@deploy1003: dreamyjazz, stran: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]]
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93338 and previous config saved to /var/cache/conftool/dbconfig/20260528-093920-fceratto.json
* 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93337 and previous config saved to /var/cache/conftool/dbconfig/20260528-093849-fceratto.json
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93336 and previous config saved to /var/cache/conftool/dbconfig/20260528-092842-fceratto.json
* 09:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
* 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93335 and previous config saved to /var/cache/conftool/dbconfig/20260528-092239-fceratto.json
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki-root1001.eqiad.wmnet
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93334 and previous config saved to /var/cache/conftool/dbconfig/20260528-091834-fceratto.json
* 09:18 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1165: Reboot completed
* 09:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:17 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 09:14 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:13 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki-root1001.eqiad.wmnet
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93332 and previous config saved to /var/cache/conftool/dbconfig/20260528-091231-fceratto.json
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93331 and previous config saved to /var/cache/conftool/dbconfig/20260528-090826-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93329 and previous config saved to /var/cache/conftool/dbconfig/20260528-090224-fceratto.json
* 09:02 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod (duration: 02m 31s)
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93328 and previous config saved to /var/cache/conftool/dbconfig/20260528-090114-fceratto.json
* 09:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2216.codfw.wmnet with reason: Maintenance
* 09:00 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a] (duration: 02m 08s)
* 08:59 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod
* 08:58 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a]
* 08:57 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host (duration: 00m 53s)
* 08:56 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host
* 08:56 joal@deploy1003: Finished deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a] (duration: 06m 54s)
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93327 and previous config saved to /var/cache/conftool/dbconfig/20260528-085216-fceratto.json
* 08:50 XioNoX: cr1-codfw# delete protocols bgp group fundraising family inet6 - [[phab:T423384|T423384]]
* 08:49 joal@deploy1003: Started deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a]
* 08:49 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] (duration: 09m 20s)
* 08:49 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a] (duration: 02m 00s)
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93326 and previous config saved to /var/cache/conftool/dbconfig/20260528-084906-fceratto.json
* 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
* 08:48 slyngshede@dns1004: END - running authdns-update
* 08:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1165: Reboot completed
* 08:47 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a]
* 08:47 slyngs: Upgrade IDP to CAS 7.3.7.1
* 08:46 slyngshede@dns1004: START - running authdns-update
* 08:45 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93324 and previous config saved to /var/cache/conftool/dbconfig/20260528-084149-fceratto.json
* 08:41 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]]
* 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 08:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93323 and previous config saved to /var/cache/conftool/dbconfig/20260528-083504-fceratto.json
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93322 and previous config saved to /var/cache/conftool/dbconfig/20260528-083331-fceratto.json
* 08:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Test
* 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93320 and previous config saved to /var/cache/conftool/dbconfig/20260528-082324-fceratto.json
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2189: repool after crash
* 08:17 slyngshede@dns1004: END - running authdns-update
* 08:16 slyngshede@dns1004: START - running authdns-update
* 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93318 and previous config saved to /var/cache/conftool/dbconfig/20260528-081316-fceratto.json
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:09 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Test
* 08:05 hashar@deploy1003: Finished deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016 (duration: 00m 13s)
* 08:05 hashar@deploy1003: Started deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93315 and previous config saved to /var/cache/conftool/dbconfig/20260528-080309-fceratto.json
* 07:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93314 and previous config saved to /var/cache/conftool/dbconfig/20260528-075631-fceratto.json
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020,1022-1023].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
* 07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93313 and previous config saved to /var/cache/conftool/dbconfig/20260528-075521-fceratto.json
* 07:47 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93311 and previous config saved to /var/cache/conftool/dbconfig/20260528-074513-fceratto.json
* 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2189: repool after crash
* 07:36 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93309 and previous config saved to /var/cache/conftool/dbconfig/20260528-073506-fceratto.json
* 07:34 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:29 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] (duration: 06m 29s)
* 07:25 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Continuing with deployment
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93308 and previous config saved to /var/cache/conftool/dbconfig/20260528-072458-fceratto.json
* 07:24 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:24 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:23 tgr@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwikisource --logwiki=metawiki Ioed Renamed_user_4232d41570b9e8f46ef150e5e360e446 # [[phab:T427459|T427459]]
* 07:22 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]]
* 07:20 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] (duration: 06m 54s)
* 07:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93307 and previous config saved to /var/cache/conftool/dbconfig/20260528-071836-fceratto.json
* 07:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 07:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Reboot completed
* 07:16 wmde-fisch@deploy1003: wmde-fisch, robertsky: Continuing with deployment
* 07:15 wmde-fisch@deploy1003: wmde-fisch, robertsky: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]]
* 07:11 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] (duration: 07m 15s)
* 07:07 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Continuing with deployment
* 07:06 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:04 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]]
* 06:43 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Reboot completed
* 06:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93303 and previous config saved to /var/cache/conftool/dbconfig/20260528-064217-fceratto.json
* 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93302 and previous config saved to /var/cache/conftool/dbconfig/20260528-063357-fceratto.json
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 06:25 hashar: Restarting CI Jenkins for plugins upgrades
* 06:16 fceratto@dns1005: END - running authdns-update
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1209 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93301 and previous config saved to /var/cache/conftool/dbconfig/20260528-061609-fceratto.json
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1193 to s8 primary and set section read-write [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93300 and previous config saved to /var/cache/conftool/dbconfig/20260528-061138-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s8 eqiad as read-only for maintenance - [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93299 and previous config saved to /var/cache/conftool/dbconfig/20260528-061048-fceratto.json
* 06:10 federico3: Starting s8 eqiad failover from db1209 to db1193 - [[phab:T426095|T426095]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1193 with weight 0 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93298 and previous config saved to /var/cache/conftool/dbconfig/20260528-060412-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 [[phab:T426095|T426095]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:53 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:49 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:25 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] (duration: 07m 12s)
* 00:21 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]]
* 00:12 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] (duration: 07m 25s)
* 00:09 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:06 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:04 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]]
* 00:04 swfrench-wmf: reprepro include php-apcu_5.1.24-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
== 2026-05-27 ==
* 23:13 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] (duration: 08m 42s)
* 23:09 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Continuing with deployment
* 23:06 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:04 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]]
* 22:58 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] (duration: 07m 49s)
* 22:55 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.sanitarium_restart (exit_code=0)
* 22:54 catrope@deploy1003: catrope: Continuing with deployment
* 22:52 catrope@deploy1003: catrope: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:50 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]]
* 22:46 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] (duration: 06m 54s)
* 22:42 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:41 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:40 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitarium_restart (exit_code=99)
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]]
* 22:39 ladsgroup@deploy1003: Finished scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) (duration: 07m 16s)
* 22:35 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:34 ladsgroup@deploy1003: ladsgroup: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:33 ladsgroup@deploy1003: Started scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]])
* 22:13 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] (duration: 10m 00s)
* 22:09 egardner@deploy1003: egardner: Continuing with deployment
* 22:05 egardner@deploy1003: egardner: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:03 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]]
* 21:37 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15 days, 0:00:00 on relforge[1008-1010].eqiad.wmnet with reason: non-production environment
* 21:20 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 21:19 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 21:04 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] (duration: 07m 38s)
* 20:59 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Continuing with deployment
* 20:58 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:56 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]]
* 20:51 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 30s)
* 20:47 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:46 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:44 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]]
* 20:43 swfrench-wmf: reprepro include dh-php_5.5+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:39 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts lvs1016.eqiad.wmnet
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:38 swfrench-wmf: reprepro include php-defaults_94+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:37 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:31 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:27 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf12u2 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:25 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs1016.eqiad.wmnet
* 20:25 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] (duration: 08m 11s)
* 20:21 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1016.eqiad.wmnet with OS bullseye
* 20:21 sbisson@deploy1003: sbisson: Continuing with deployment
* 20:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1020.eqiad.wmnet
* 20:19 sbisson@deploy1003: sbisson: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]]
* 20:14 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12355
* 20:04 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 12355
* 19:51 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1016.eqiad.wmnet with OS bullseye
* 19:48 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5032.eqsin.wmnet
* 19:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 19:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 19:01 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f] (duration: 02m 08s)
* 18:59 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f]
* 18:58 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 05m 01s)
* 18:53 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:53 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] (duration: 07m 41s)
* 18:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5031.eqsin.wmnet
* 18:49 catrope@deploy1003: catrope: Continuing with deployment
* 18:47 catrope@deploy1003: catrope: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:45 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]]
* 18:40 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 01m 05s)
* 18:39 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:37 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f] (duration: 02m 04s)
* 18:35 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f]
* 18:29 swfrench@deploy1003: Finished scap sync-world: Helmfile-only deployment to clean up unused mesh listeners (duration: 06m 12s)
* 18:25 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:24 swfrench@deploy1003: swfrench: Helmfile-only deployment to clean up unused mesh listeners synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:23 swfrench@deploy1003: Started scap sync-world: Helmfile-only deployment to clean up unused mesh listeners
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93296 and previous config saved to /var/cache/conftool/dbconfig/20260527-181923-fceratto.json
* 18:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93295 and previous config saved to /var/cache/conftool/dbconfig/20260527-180915-fceratto.json
* 18:09 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 18:09 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] (duration: 10m 24s)
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1017.eqiad.wmnet
* 18:08 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1017.eqiad.wmnet
* 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5024.eqsin.wmnet
* 18:03 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:00 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:00 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:00 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93294 and previous config saved to /var/cache/conftool/dbconfig/20260527-175908-fceratto.json
* 17:58 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]]
* 17:55 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93293 and previous config saved to /var/cache/conftool/dbconfig/20260527-174900-fceratto.json
* 17:43 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] (duration: 15m 01s)
* 17:38 swfrench@deploy1003: swfrench: Continuing with deployment
* 17:31 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:28 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]]
* 17:25 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1114.eqiad.wmnet
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:05 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] (duration: 08m 44s)
* 17:00 swfrench@deploy1003: swfrench: Continuing with deployment
* 16:58 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:56 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]]
* 16:53 atsuko@dns1004: END - running authdns-update
* 16:51 atsuko@dns1004: START - running authdns-update
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93292 and previous config saved to /var/cache/conftool/dbconfig/20260527-164846-fceratto.json
* 16:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93291 and previous config saved to /var/cache/conftool/dbconfig/20260527-164815-fceratto.json
* 16:43 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1112.eqiad.wmnet
* 16:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: Setting up
* 16:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93290 and previous config saved to /var/cache/conftool/dbconfig/20260527-163808-fceratto.json
* 16:37 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Repooling after testing patch
* 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93287 and previous config saved to /var/cache/conftool/dbconfig/20260527-162800-fceratto.json
* 16:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93285 and previous config saved to /var/cache/conftool/dbconfig/20260527-161753-fceratto.json
* 16:14 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 16:12 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 16:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93284 and previous config saved to /var/cache/conftool/dbconfig/20260527-161101-fceratto.json
* 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
* 16:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93283 and previous config saved to /var/cache/conftool/dbconfig/20260527-161034-fceratto.json
* 16:10 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 16:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1178: Recovering from failure in cookbook
* 16:10 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 16:05 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host durum5003.eqsin.wmnet with OS trixie
* 16:03 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6016.drmrs.wmnet
* 16:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93280 and previous config saved to /var/cache/conftool/dbconfig/20260527-160027-fceratto.json
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1017.eqiad.wmnet
* 15:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2163.codfw.wmnet
* 15:53 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2163.codfw.wmnet
* 15:52 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1017.eqiad.wmnet
* 15:52 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Repooling after testing patch
* 15:52 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 15:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Testing cookbook
* 15:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Testing cookbook
* 15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93276 and previous config saved to /var/cache/conftool/dbconfig/20260527-155019-fceratto.json
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93274 and previous config saved to /var/cache/conftool/dbconfig/20260527-154011-fceratto.json
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1178: Recovering from failure in cookbook
* 15:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1178.eqiad.wmnet
* 15:22 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1178.eqiad.wmnet
* 15:19 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 15:19 cdanis: 💙cdanis@cp4047.ulsfo.wmnet ~ 🕦☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:13 cdanis: 💙cdanis@cp5026.eqsin.wmnet ~ 🕚☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Icinga wait failed during run
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕚☕ sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93268 and previous config saved to /var/cache/conftool/dbconfig/20260527-150508-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1220.eqiad.wmnet with reason: Maintenance
* 15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93267 and previous config saved to /var/cache/conftool/dbconfig/20260527-150438-fceratto.json
* 14:59 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 14:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93264 and previous config saved to /var/cache/conftool/dbconfig/20260527-145430-fceratto.json
* 14:54 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2163.codfw.wmnet with OS trixie
* 14:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:46 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] (duration: 08m 32s)
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1178.eqiad.wmnet with OS trixie
* 14:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93263 and previous config saved to /var/cache/conftool/dbconfig/20260527-144423-fceratto.json
* 14:42 aude@deploy1003: aude: Continuing with deployment
* 14:40 aude@deploy1003: aude: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2189.codfw.wmnet with reason: crashed [[phab:T427376|T427376]]
* 14:38 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]]
* 14:35 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] (duration: 11m 30s)
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93262 and previous config saved to /var/cache/conftool/dbconfig/20260527-143416-fceratto.json
* 14:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 aude@deploy1003: aude: Continuing with deployment
* 14:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:27 aude@deploy1003: aude: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93260 and previous config saved to /var/cache/conftool/dbconfig/20260527-142659-fceratto.json
* 14:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:23 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]]
* 14:22 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1033.eqiad.wmnet with reason: Maintenance
* 14:18 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] (duration: 33m 01s)
* 14:10 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2163.codfw.wmnet with OS trixie
* 14:09 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS trixie
* 14:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1178: Upgrading db1178.eqiad.wmnet
* 14:07 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1178: Upgrading db1178.eqiad.wmnet
* 14:06 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 stran@deploy1003: stran: Continuing with deployment
* 14:02 stran@deploy1003: stran: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:56 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2164: Migration of db2164.codfw.wmnet completed
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:45 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]]
* 13:40 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] (duration: 11m 35s)
* 13:36 phuedx@deploy1003: phuedx: Continuing with deployment
* 13:30 phuedx@deploy1003: phuedx: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]]
* 13:21 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] (duration: 13m 23s)
* 13:15 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2189: Test
* 13:15 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 13:15 mlitn@deploy1003: krinkle, mlitn: Continuing with deployment
* 13:13 mlitn@deploy1003: krinkle, mlitn: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2164: Migration of db2164.codfw.wmnet completed
* 13:08 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]]
* 13:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2212.codfw.wmnet with reason: failed to reboot [[phab:T427388|T427388]] [[phab:T426633|T426633]]
* 13:05 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2164.codfw.wmnet with OS trixie
* 12:57 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1192.eqiad.wmnet with OS trixie
* 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:35 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:28 Amir1: deleting binlogs older than a year
* 12:22 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2164.codfw.wmnet with OS trixie
* 12:21 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 36692
* 12:21 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1192.eqiad.wmnet with OS trixie
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1077
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
* 12:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2164: Upgrading db2164.codfw.wmnet
* 12:20 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 36692
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1079
* 12:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2164: Upgrading db2164.codfw.wmnet
* 12:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1192: Upgrading db1192.eqiad.wmnet
* 12:19 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1192: Upgrading db1192.eqiad.wmnet
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:15 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Migration of db2165.codfw.wmnet completed
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:12 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2189: Test
* 12:11 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1193: Migration of db1193.eqiad.wmnet completed
* 12:09 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93243 and previous config saved to /var/cache/conftool/dbconfig/20260527-120452-fceratto.json
* 12:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93242 and previous config saved to /var/cache/conftool/dbconfig/20260527-120205-fceratto.json
* 12:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 11:58 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:58 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:56 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93239 and previous config saved to /var/cache/conftool/dbconfig/20260527-115157-fceratto.json
* 11:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93237 and previous config saved to /var/cache/conftool/dbconfig/20260527-114149-fceratto.json
* 11:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93235 and previous config saved to /var/cache/conftool/dbconfig/20260527-113142-fceratto.json
* 11:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Migration of db2165.codfw.wmnet completed
* 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1193: Migration of db1193.eqiad.wmnet completed
* 11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93231 and previous config saved to /var/cache/conftool/dbconfig/20260527-112327-fceratto.json
* 11:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2188.codfw.wmnet with reason: Maintenance
* 11:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93230 and previous config saved to /var/cache/conftool/dbconfig/20260527-112257-fceratto.json
* 11:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2165.codfw.wmnet with OS trixie
* 11:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1193.eqiad.wmnet with OS trixie
* 11:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93229 and previous config saved to /var/cache/conftool/dbconfig/20260527-111250-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:10 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93227 and previous config saved to /var/cache/conftool/dbconfig/20260527-110242-fceratto.json
* 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2189', diff saved to https://phabricator.wikimedia.org/P93226 and previous config saved to /var/cache/conftool/dbconfig/20260527-110016-marostegui.json
* 10:58 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:57 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 10:56 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93225 and previous config saved to /var/cache/conftool/dbconfig/20260527-105235-fceratto.json
* 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93223 and previous config saved to /var/cache/conftool/dbconfig/20260527-104518-fceratto.json
* 10:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93222 and previous config saved to /var/cache/conftool/dbconfig/20260527-104449-fceratto.json
* 10:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2165.codfw.wmnet with OS trixie
* 10:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1193.eqiad.wmnet with OS trixie
* 10:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2165: Upgrading db2165.codfw.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2165: Upgrading db2165.codfw.wmnet
* 10:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93218 and previous config saved to /var/cache/conftool/dbconfig/20260527-103441-fceratto.json
* 10:29 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:29 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93217 and previous config saved to /var/cache/conftool/dbconfig/20260527-102434-fceratto.json
* 10:22 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:21 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93215 and previous config saved to /var/cache/conftool/dbconfig/20260527-101426-fceratto.json
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1203: Migration of db1203.eqiad.wmnet completed
* 10:10 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2166: Migration of db2166.codfw.wmnet completed
* 10:08 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93212 and previous config saved to /var/cache/conftool/dbconfig/20260527-100701-fceratto.json
* 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 10:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93211 and previous config saved to /var/cache/conftool/dbconfig/20260527-100632-fceratto.json
* 10:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 10:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1050.eqiad.wmnet with OS trixie
* 09:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93208 and previous config saved to /var/cache/conftool/dbconfig/20260527-095624-fceratto.json
* 09:47 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93206 and previous config saved to /var/cache/conftool/dbconfig/20260527-094616-fceratto.json
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:43 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:38 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 09:38 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 09:37 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main'.
* 09:37 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 09:36 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93203 and previous config saved to /var/cache/conftool/dbconfig/20260527-093609-fceratto.json
* 09:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93202 and previous config saved to /var/cache/conftool/dbconfig/20260527-092842-fceratto.json
* 09:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 09:28 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1203: Migration of db1203.eqiad.wmnet completed
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93200 and previous config saved to /var/cache/conftool/dbconfig/20260527-092814-fceratto.json
* 09:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1050.eqiad.wmnet with OS trixie
* 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 09:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2166: Migration of db2166.codfw.wmnet completed
* 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2051: repool after maintenance
* 09:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1203.eqiad.wmnet with OS trixie
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93196 and previous config saved to /var/cache/conftool/dbconfig/20260527-091806-fceratto.json
* 09:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2166.codfw.wmnet with OS trixie
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93194 and previous config saved to /var/cache/conftool/dbconfig/20260527-090759-fceratto.json
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3074.*
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3066.*
* 09:03 fabfur: repooling cp3074 and cp3066 ([[phab:T419825|T419825]])
* 09:02 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6015.drmrs.wmnet
* 09:02 slyngshede@cumin1003: START - Cookbook sre.hosts.remove-downtime for cp6015.drmrs.wmnet
* 09:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 09:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp6015.*
* 08:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93193 and previous config saved to /var/cache/conftool/dbconfig/20260527-085751-fceratto.json
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 08:54 Emperor: restart swift on ms-fe2011 [[phab:T360913|T360913]]
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3066.*
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3074.*
* 08:51 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:50 fabfur: depooling and installing haproxy-awslc on cp3074 and cp3066 ([[phab:T419825|T419825]])
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93191 and previous config saved to /var/cache/conftool/dbconfig/20260527-085024-fceratto.json
* 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93190 and previous config saved to /var/cache/conftool/dbconfig/20260527-085005-fceratto.json
* 08:41 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1203.eqiad.wmnet with OS trixie
* 08:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93189 and previous config saved to /var/cache/conftool/dbconfig/20260527-083957-fceratto.json
* 08:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2051: repool after maintenance
* 08:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 08:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:35 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2166.codfw.wmnet with OS trixie
* 08:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2051.codfw.wmnet with OS trixie
* 08:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93185 and previous config saved to /var/cache/conftool/dbconfig/20260527-082950-fceratto.json
* 08:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93184 and previous config saved to /var/cache/conftool/dbconfig/20260527-081942-fceratto.json
* 08:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:11 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93183 and previous config saved to /var/cache/conftool/dbconfig/20260527-081112-fceratto.json
* 08:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93182 and previous config saved to /var/cache/conftool/dbconfig/20260527-081054-fceratto.json
* 08:07 jmm@dns1004: END - running authdns-update
* 08:05 jmm@dns1004: START - running authdns-update
* 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93181 and previous config saved to /var/cache/conftool/dbconfig/20260527-080046-fceratto.json
* 07:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2051.codfw.wmnet with OS trixie
* 07:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93180 and previous config saved to /var/cache/conftool/dbconfig/20260527-075039-fceratto.json
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1026.eqiad.wmnet
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2051: Upgrading es2051.codfw.wmnet
* 07:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2051: Upgrading es2051.codfw.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93178 and previous config saved to /var/cache/conftool/dbconfig/20260527-074031-fceratto.json
* 07:40 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] (duration: 06m 42s)
* 07:36 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 07:35 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93177 and previous config saved to /var/cache/conftool/dbconfig/20260527-073504-fceratto.json
* 07:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2248.codfw.wmnet with reason: Maintenance
* 07:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93176 and previous config saved to /var/cache/conftool/dbconfig/20260527-073434-fceratto.json
* 07:33 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]]
* 07:28 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93175 and previous config saved to /var/cache/conftool/dbconfig/20260527-072426-fceratto.json
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.decommission (exit_code=0)
* 07:23 marostegui@cumin1003: Removing pc1014 from zarcillo [[phab:T427190|T427190]]
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1014.eqiad.wmnet
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:23 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:18 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1026.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1025.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93174 and previous config saved to /var/cache/conftool/dbconfig/20260527-071418-fceratto.json
* 07:13 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1014.eqiad.wmnet
* 07:13 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2003.wikimedia.org
* 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2055: repool after maintenance
* 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2003.wikimedia.org
* 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1003.wikimedia.org
* 07:06 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: Maintenance on db1190
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93172 and previous config saved to /var/cache/conftool/dbconfig/20260527-070410-fceratto.json
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1003.wikimedia.org
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93171 and previous config saved to /var/cache/conftool/dbconfig/20260527-065545-fceratto.json
* 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93170 and previous config saved to /var/cache/conftool/dbconfig/20260527-065526-fceratto.json
* 06:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1025.eqiad.wmnet
* 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93168 and previous config saved to /var/cache/conftool/dbconfig/20260527-064519-fceratto.json
* 06:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93166 and previous config saved to /var/cache/conftool/dbconfig/20260527-063511-fceratto.json
* 06:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93165 and previous config saved to /var/cache/conftool/dbconfig/20260527-062503-fceratto.json
* 06:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2055: repool after maintenance
* 06:21 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2055.codfw.wmnet with OS trixie
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93163 and previous config saved to /var/cache/conftool/dbconfig/20260527-061643-fceratto.json
* 06:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93162 and previous config saved to /var/cache/conftool/dbconfig/20260527-061613-fceratto.json
* 06:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93161 and previous config saved to /var/cache/conftool/dbconfig/20260527-060606-fceratto.json
* 06:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:56 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93160 and previous config saved to /var/cache/conftool/dbconfig/20260527-055558-fceratto.json
* 05:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93159 and previous config saved to /var/cache/conftool/dbconfig/20260527-054550-fceratto.json
* 05:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2055.codfw.wmnet with OS trixie
* 05:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:38 moritzm: remove ganeti1026 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93157 and previous config saved to /var/cache/conftool/dbconfig/20260527-053727-fceratto.json
* 05:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93156 and previous config saved to /var/cache/conftool/dbconfig/20260527-053708-fceratto.json
* 05:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93155 and previous config saved to /var/cache/conftool/dbconfig/20260527-052700-fceratto.json
* 05:26 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1014 from dbctl [[phab:T427270|T427270]]', diff saved to https://phabricator.wikimedia.org/P93154 and previous config saved to /var/cache/conftool/dbconfig/20260527-052624-marostegui.json
* 05:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93153 and previous config saved to /var/cache/conftool/dbconfig/20260527-051653-fceratto.json
* 05:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93152 and previous config saved to /var/cache/conftool/dbconfig/20260527-050645-fceratto.json
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93151 and previous config saved to /var/cache/conftool/dbconfig/20260527-045827-fceratto.json
* 04:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93150 and previous config saved to /var/cache/conftool/dbconfig/20260527-045759-fceratto.json
* 04:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93149 and previous config saved to /var/cache/conftool/dbconfig/20260527-044751-fceratto.json
* 04:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93148 and previous config saved to /var/cache/conftool/dbconfig/20260527-043744-fceratto.json
* 04:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93147 and previous config saved to /var/cache/conftool/dbconfig/20260527-042737-fceratto.json
* 04:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93146 and previous config saved to /var/cache/conftool/dbconfig/20260527-041921-fceratto.json
* 04:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 04:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93145 and previous config saved to /var/cache/conftool/dbconfig/20260527-041852-fceratto.json
* 04:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93144 and previous config saved to /var/cache/conftool/dbconfig/20260527-040844-fceratto.json
* 03:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93143 and previous config saved to /var/cache/conftool/dbconfig/20260527-035836-fceratto.json
* 03:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93142 and previous config saved to /var/cache/conftool/dbconfig/20260527-034828-fceratto.json
* 03:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93141 and previous config saved to /var/cache/conftool/dbconfig/20260527-034008-fceratto.json
* 03:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 03:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93140 and previous config saved to /var/cache/conftool/dbconfig/20260527-033938-fceratto.json
* 03:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93139 and previous config saved to /var/cache/conftool/dbconfig/20260527-032931-fceratto.json
* 03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93138 and previous config saved to /var/cache/conftool/dbconfig/20260527-031923-fceratto.json
* 03:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93137 and previous config saved to /var/cache/conftool/dbconfig/20260527-030915-fceratto.json
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93136 and previous config saved to /var/cache/conftool/dbconfig/20260527-030045-fceratto.json
* 03:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: Maintenance
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93135 and previous config saved to /var/cache/conftool/dbconfig/20260527-030016-fceratto.json
* 02:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93134 and previous config saved to /var/cache/conftool/dbconfig/20260527-025008-fceratto.json
* 02:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93133 and previous config saved to /var/cache/conftool/dbconfig/20260527-024000-fceratto.json
* 02:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93132 and previous config saved to /var/cache/conftool/dbconfig/20260527-022953-fceratto.json
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93131 and previous config saved to /var/cache/conftool/dbconfig/20260527-022133-fceratto.json
* 02:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93130 and previous config saved to /var/cache/conftool/dbconfig/20260527-022100-fceratto.json
* 02:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93129 and previous config saved to /var/cache/conftool/dbconfig/20260527-021053-fceratto.json
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93128 and previous config saved to /var/cache/conftool/dbconfig/20260527-020045-fceratto.json
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93127 and previous config saved to /var/cache/conftool/dbconfig/20260527-015037-fceratto.json
* 01:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93126 and previous config saved to /var/cache/conftool/dbconfig/20260527-014204-fceratto.json
* 01:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 01:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93125 and previous config saved to /var/cache/conftool/dbconfig/20260527-014134-fceratto.json
* 01:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93124 and previous config saved to /var/cache/conftool/dbconfig/20260527-013126-fceratto.json
* 01:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93123 and previous config saved to /var/cache/conftool/dbconfig/20260527-012119-fceratto.json
* 01:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93122 and previous config saved to /var/cache/conftool/dbconfig/20260527-011111-fceratto.json
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93121 and previous config saved to /var/cache/conftool/dbconfig/20260527-010234-fceratto.json
* 01:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93120 and previous config saved to /var/cache/conftool/dbconfig/20260527-010205-fceratto.json
* 00:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93119 and previous config saved to /var/cache/conftool/dbconfig/20260527-005157-fceratto.json
* 00:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93118 and previous config saved to /var/cache/conftool/dbconfig/20260527-004149-fceratto.json
* 00:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93117 and previous config saved to /var/cache/conftool/dbconfig/20260527-003141-fceratto.json
* 00:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93116 and previous config saved to /var/cache/conftool/dbconfig/20260527-002309-fceratto.json
* 00:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 00:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93115 and previous config saved to /var/cache/conftool/dbconfig/20260527-002228-fceratto.json
* 00:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93114 and previous config saved to /var/cache/conftool/dbconfig/20260527-001220-fceratto.json
* 00:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93113 and previous config saved to /var/cache/conftool/dbconfig/20260527-000209-fceratto.json
== 2026-05-26 ==
* 23:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93112 and previous config saved to /var/cache/conftool/dbconfig/20260526-235201-fceratto.json
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93111 and previous config saved to /var/cache/conftool/dbconfig/20260526-234451-fceratto.json
* 23:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93110 and previous config saved to /var/cache/conftool/dbconfig/20260526-234421-fceratto.json
* 23:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93109 and previous config saved to /var/cache/conftool/dbconfig/20260526-233414-fceratto.json
* 23:27 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 23:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93108 and previous config saved to /var/cache/conftool/dbconfig/20260526-232406-fceratto.json
* 23:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93107 and previous config saved to /var/cache/conftool/dbconfig/20260526-231358-fceratto.json
* 23:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93106 and previous config saved to /var/cache/conftool/dbconfig/20260526-230650-fceratto.json
* 23:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93105 and previous config saved to /var/cache/conftool/dbconfig/20260526-230620-fceratto.json
* 22:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93104 and previous config saved to /var/cache/conftool/dbconfig/20260526-225612-fceratto.json
* 22:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93103 and previous config saved to /var/cache/conftool/dbconfig/20260526-224604-fceratto.json
* 22:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93101 and previous config saved to /var/cache/conftool/dbconfig/20260526-223556-fceratto.json
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93100 and previous config saved to /var/cache/conftool/dbconfig/20260526-222848-fceratto.json
* 22:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93099 and previous config saved to /var/cache/conftool/dbconfig/20260526-222828-fceratto.json
* 22:23 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp6015.drmrs.wmnet
* 22:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93098 and previous config saved to /var/cache/conftool/dbconfig/20260526-221819-fceratto.json
* 22:10 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS trixie
* 22:08 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS trixie
* 22:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93097 and previous config saved to /var/cache/conftool/dbconfig/20260526-220811-fceratto.json
* 22:04 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] (duration: 09m 30s)
* 22:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 22:00 egardner@deploy1003: egardner, mfossati: Continuing with deployment
* 21:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93096 and previous config saved to /var/cache/conftool/dbconfig/20260526-215803-fceratto.json
* 21:57 egardner@deploy1003: egardner, mfossati: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:56 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:56 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1010.eqiad.wmnet with OS trixie
* 21:56 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:55 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]]
* 21:54 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 21:51 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93095 and previous config saved to /var/cache/conftool/dbconfig/20260526-215043-fceratto.json
* 21:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93094 and previous config saved to /var/cache/conftool/dbconfig/20260526-215011-fceratto.json
* 21:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:47 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1009
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1009
* 21:43 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1009
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:42 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1008
* 21:40 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1008
* 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93093 and previous config saved to /var/cache/conftool/dbconfig/20260526-214003-fceratto.json
* 21:36 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1008
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:36 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:35 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1010
* 21:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1010
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1010.eqiad.wmnet with OS trixie
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1009
* 21:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS trixie
* 21:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93092 and previous config saved to /var/cache/conftool/dbconfig/20260526-212955-fceratto.json
* 21:29 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1008
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS trixie
* 21:27 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "all.dblist - mediamoderation-continuous-scan.dblist - preinstall.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose` in tmux session - [[phab:T421688|T421688]]
* 21:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93091 and previous config saved to /var/cache/conftool/dbconfig/20260526-211948-fceratto.json
* 21:19 jhathaway: dmarc ingress test run mx-in1001
* 21:15 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw and A:cp
* 21:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2057.codfw.wmnet
* 21:14 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_codfw and A:cp
* 21:14 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2058.codfw.wmnet
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93090 and previous config saved to /var/cache/conftool/dbconfig/20260526-211238-fceratto.json
* 21:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2222.codfw.wmnet with reason: Maintenance
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93089 and previous config saved to /var/cache/conftool/dbconfig/20260526-211207-fceratto.json
* 21:06 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 21:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93088 and previous config saved to /var/cache/conftool/dbconfig/20260526-210159-fceratto.json
* 20:55 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2003.codfw.wmnet with reason: WIP
* 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93087 and previous config saved to /var/cache/conftool/dbconfig/20260526-205152-fceratto.json
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:50 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:45 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93086 and previous config saved to /var/cache/conftool/dbconfig/20260526-204143-fceratto.json
* 20:38 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2055.codfw.wmnet
* 20:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93085 and previous config saved to /var/cache/conftool/dbconfig/20260526-203430-fceratto.json
* 20:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2221.codfw.wmnet with reason: Maintenance
* 20:34 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2056.codfw.wmnet
* 20:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93084 and previous config saved to /var/cache/conftool/dbconfig/20260526-203357-fceratto.json
* 20:32 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 20:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 20:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93083 and previous config saved to /var/cache/conftool/dbconfig/20260526-202349-fceratto.json
* 20:18 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] (duration: 09m 14s)
* 20:14 alexsanford@deploy1003: alexsanford, aude: Continuing with deployment
* 20:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93082 and previous config saved to /var/cache/conftool/dbconfig/20260526-201341-fceratto.json
* 20:11 alexsanford@deploy1003: alexsanford, aude: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]]
* 20:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93081 and previous config saved to /var/cache/conftool/dbconfig/20260526-200333-fceratto.json
* 19:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2053.codfw.wmnet
* 19:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2029.codfw.wmnet with OS trixie
* 19:57 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2028.codfw.wmnet with OS trixie
* 19:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93080 and previous config saved to /var/cache/conftool/dbconfig/20260526-195632-fceratto.json
* 19:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
* 19:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93079 and previous config saved to /var/cache/conftool/dbconfig/20260526-195557-fceratto.json
* 19:55 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2054.codfw.wmnet
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93078 and previous config saved to /var/cache/conftool/dbconfig/20260526-194549-fceratto.json
* 19:45 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2014.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2013.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:38 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93077 and previous config saved to /var/cache/conftool/dbconfig/20260526-193541-fceratto.json
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:30 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93076 and previous config saved to /var/cache/conftool/dbconfig/20260526-192533-fceratto.json
* 19:24 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:21 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 19:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2051.codfw.wmnet
* 19:19 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93075 and previous config saved to /var/cache/conftool/dbconfig/20260526-191818-fceratto.json
* 19:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 19:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93074 and previous config saved to /var/cache/conftool/dbconfig/20260526-191748-fceratto.json
* 19:16 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2052.codfw.wmnet
* 19:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93073 and previous config saved to /var/cache/conftool/dbconfig/20260526-190740-fceratto.json
* 19:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 19:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93072 and previous config saved to /var/cache/conftool/dbconfig/20260526-185732-fceratto.json
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93071 and previous config saved to /var/cache/conftool/dbconfig/20260526-184724-fceratto.json
* 18:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
* 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
* 18:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb2014.codfw.wmnet with OS trixie
* 18:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2049.codfw.wmnet
* 18:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93070 and previous config saved to /var/cache/conftool/dbconfig/20260526-184009-fceratto.json
* 18:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93069 and previous config saved to /var/cache/conftool/dbconfig/20260526-183939-fceratto.json
* 18:37 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2050.codfw.wmnet
* 18:30 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93068 and previous config saved to /var/cache/conftool/dbconfig/20260526-182931-fceratto.json
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:29 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:24 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 18:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93066 and previous config saved to /var/cache/conftool/dbconfig/20260526-181923-fceratto.json
* 18:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93065 and previous config saved to /var/cache/conftool/dbconfig/20260526-180915-fceratto.json
* 18:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93064 and previous config saved to /var/cache/conftool/dbconfig/20260526-180205-fceratto.json
* 18:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 18:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93063 and previous config saved to /var/cache/conftool/dbconfig/20260526-180132-fceratto.json
* 18:00 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2047.codfw.wmnet
* 17:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2048.codfw.wmnet
* 17:54 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93062 and previous config saved to /var/cache/conftool/dbconfig/20260526-175124-fceratto.json
* 17:42 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] (duration: 07m 25s)
* 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93060 and previous config saved to /var/cache/conftool/dbconfig/20260526-174117-fceratto.json
* 17:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ms-be2089.codfw.wmnet
* 17:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]]
* 17:33 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93059 and previous config saved to /var/cache/conftool/dbconfig/20260526-173109-fceratto.json
* 17:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:26 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:25 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93058 and previous config saved to /var/cache/conftool/dbconfig/20260526-172332-fceratto.json
* 17:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93057 and previous config saved to /var/cache/conftool/dbconfig/20260526-172303-fceratto.json
* 17:21 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet
* 17:20 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 17:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet
* 17:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:17 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 17:14 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:13 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:13 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93056 and previous config saved to /var/cache/conftool/dbconfig/20260526-171255-fceratto.json
* 17:11 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93055 and previous config saved to /var/cache/conftool/dbconfig/20260526-170247-fceratto.json
* 17:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:57 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93054 and previous config saved to /var/cache/conftool/dbconfig/20260526-165240-fceratto.json
* 16:50 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93053 and previous config saved to /var/cache/conftool/dbconfig/20260526-164421-fceratto.json
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
* 16:44 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93052 and previous config saved to /var/cache/conftool/dbconfig/20260526-164352-fceratto.json
* 16:42 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2043.codfw.wmnet
* 16:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2044.codfw.wmnet
* 16:40 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:40 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:40 brett: reboot lvs101[345].eqiad.wmnet
* 16:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:36 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:35 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_codfw and A:cp
* 16:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93051 and previous config saved to /var/cache/conftool/dbconfig/20260526-163344-fceratto.json
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_codfw and A:cp
* 16:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:31 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93050 and previous config saved to /var/cache/conftool/dbconfig/20260526-162336-fceratto.json
* 16:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93049 and previous config saved to /var/cache/conftool/dbconfig/20260526-161328-fceratto.json
* 16:11 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:11 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=eqiad
* 16:06 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93047 and previous config saved to /var/cache/conftool/dbconfig/20260526-160450-fceratto.json
* 16:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93046 and previous config saved to /var/cache/conftool/dbconfig/20260526-160420-fceratto.json
* 16:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:03 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 28s)
* 16:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 16:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 22s)
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93045 and previous config saved to /var/cache/conftool/dbconfig/20260526-155413-fceratto.json
* 15:46 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=eqiad
* 15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93044 and previous config saved to /var/cache/conftool/dbconfig/20260526-154405-fceratto.json
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93043 and previous config saved to /var/cache/conftool/dbconfig/20260526-153357-fceratto.json
* 15:30 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93042 and previous config saved to /var/cache/conftool/dbconfig/20260526-152629-fceratto.json
* 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93041 and previous config saved to /var/cache/conftool/dbconfig/20260526-152559-fceratto.json
* 15:24 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93040 and previous config saved to /var/cache/conftool/dbconfig/20260526-151552-fceratto.json
* 15:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Rack maintenance completed
* 15:10 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2196.codfw.wmnet
* 15:10 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2196.codfw.wmnet
* 15:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=codfw
* 15:06 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: Rack maintenance completed
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93037 and previous config saved to /var/cache/conftool/dbconfig/20260526-150546-fceratto.json
* 15:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: Rack maintenance completed
* 15:04 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]] (duration: 00m 39s)
* 15:03 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]]
* 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]] (duration: 00m 45s)
* 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]]
* 15:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 15:01 bjensen: uploading prometheus-memcached-exporter_0.16.0-1_amd64 on apt1002
* 15:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 15:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2223: switch maintenance
* 14:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Rack maintenance completed
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93033 and previous config saved to /var/cache/conftool/dbconfig/20260526-145538-fceratto.json
* 14:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 moritzm: remove ganeti1025 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2222: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2221: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:49 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93030 and previous config saved to /var/cache/conftool/dbconfig/20260526-144718-fceratto.json
* 14:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 14:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93029 and previous config saved to /var/cache/conftool/dbconfig/20260526-144651-fceratto.json
* 14:45 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:45 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:43 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=codfw
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2167: Migration of db2167.codfw.wmnet completed
* 14:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93026 and previous config saved to /var/cache/conftool/dbconfig/20260526-143643-fceratto.json
* 14:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1054.eqiad.wmnet with OS trixie
* 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93023 and previous config saved to /var/cache/conftool/dbconfig/20260526-142636-fceratto.json
* 14:26 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1014: Rack maintenance completed
* 14:24 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1014: Rack maintenance completed
* 14:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 14:19 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:19 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:18 jynus: restarting mediabackups@codfw after maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93021 and previous config saved to /var/cache/conftool/dbconfig/20260526-141628-fceratto.json
* 14:16 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:14 fabfur: repooled cp2043 ([[phab:T426199|T426199]])
* 14:14 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2223: switch maintenance
* 14:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:14 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 14:13 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] (duration: 06m 40s)
* 14:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:10 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:10 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2011.codfw.wmnet
* 14:10 fabfur@cumin1003: START - Cookbook sre.hosts.remove-downtime for lvs2011.codfw.wmnet
* 14:09 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:09 fabfur: restoring lvs2011 as primary ([[phab:T426199|T426199]])
* 14:08 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:08 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93017 and previous config saved to /var/cache/conftool/dbconfig/20260526-140748-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93016 and previous config saved to /var/cache/conftool/dbconfig/20260526-140718-fceratto.json
* 14:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]]
* 14:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
* 14:05 marostegui@cumin1003: Removing pc1013 from zarcillo [[phab:T427190|T427190]]
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1013.eqiad.wmnet
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:04 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:00 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 13:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93014 and previous config saved to /var/cache/conftool/dbconfig/20260526-135711-fceratto.json
* 13:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1054.eqiad.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2167: Migration of db2167.codfw.wmnet completed
* 13:53 Amir1: drop flaggedrevs tables on cawikinews ([[phab:T423577|T423577]])
* 13:49 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1013.eqiad.wmnet
* 13:49 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 13:48 Lucas_WMDE: UTC afternoon backport+config window done
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93012 and previous config saved to /var/cache/conftool/dbconfig/20260526-134703-fceratto.json
* 13:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2167.codfw.wmnet with OS trixie
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93011 and previous config saved to /var/cache/conftool/dbconfig/20260526-133656-fceratto.json
* 13:36 XioNoX: reboot lsw1-a2-codfw for software upgrade - [[phab:T426199|T426199]]
* 13:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: switch maintenance
* 13:35 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] (duration: 09m 28s)
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2221: switch maintenance
* 13:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: switch maintenance
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2196: switch maintenance
* 13:31 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:30 stran@deploy1003: stran: Continuing with deployment
* 13:29 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93006 and previous config saved to /var/cache/conftool/dbconfig/20260526-132927-fceratto.json
* 13:29 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: Maintenance
* 13:29 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 34 hosts with reason: Switch maintenance
* 13:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93005 and previous config saved to /var/cache/conftool/dbconfig/20260526-132857-fceratto.json
* 13:28 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lsw1-a2-codfw,lsw1-a2-codfw IPv6,lsw1-a2-codfw.mgmt with reason: Switch maintenance
* 13:27 stran@deploy1003: stran: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]]
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] (duration: 08m 30s)
* 13:22 ladsgroup@dns1004: END - running authdns-update
* 13:20 ladsgroup@dns1004: START - running authdns-update
* 13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93004 and previous config saved to /var/cache/conftool/dbconfig/20260526-131850-fceratto.json
* 13:18 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Continuing with deployment
* 13:16 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]]
* 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] (duration: 07m 09s)
* 13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93003 and previous config saved to /var/cache/conftool/dbconfig/20260526-130842-fceratto.json
* 13:08 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:07 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2167.codfw.wmnet with OS trixie
* 13:07 sbisson@deploy1003: sbisson: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2167: Upgrading db2167.codfw.wmnet
* 13:05 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]]
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2167: Upgrading db2167.codfw.wmnet
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:04 kart_: Update Recommendation API to 2026-05-26-074931-production
* 13:03 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:00 topranks: deactivate CR BGP to doh2002 to test backup path via doh2001
* 12:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93000 and previous config saved to /var/cache/conftool/dbconfig/20260526-125834-fceratto.json
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92999 and previous config saved to /var/cache/conftool/dbconfig/20260526-125135-fceratto.json
* 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2226.codfw.wmnet with reason: Maintenance
* 12:51 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92998 and previous config saved to /var/cache/conftool/dbconfig/20260526-125105-fceratto.json
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92997 and previous config saved to /var/cache/conftool/dbconfig/20260526-124059-fceratto.json
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2003.wikimedia.org
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1214: Migration of db1214.eqiad.wmnet completed
* 12:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc2003.wikimedia.org
* 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92995 and previous config saved to /var/cache/conftool/dbconfig/20260526-123052-fceratto.json
* 12:26 fabfur: depooled cp2043 for network activity ([[phab:T426199|T426199]])
* 12:26 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 12:24 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-a1-codfw,ssw1-a1-codfw IPv6,ssw1-a1-codfw.mgmt with reason: Switch maintenance
* 12:24 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mirror1001.wikimedia.org
* 12:23 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 12:23 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 12:22 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92993 and previous config saved to /var/cache/conftool/dbconfig/20260526-122044-fceratto.json
* 12:20 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 12:19 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mirror1001.wikimedia.org
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92991 and previous config saved to /var/cache/conftool/dbconfig/20260526-121336-fceratto.json
* 12:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: Maintenance
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92990 and previous config saved to /var/cache/conftool/dbconfig/20260526-121306-fceratto.json
* 12:09 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Planned downtime for rack maintenance
* 12:08 fabfur: downtime, disable puppet and stop pybal for rack maintenance ([[phab:T426199|T426199]])
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2181: Migration of db2181.codfw.wmnet completed
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92987 and previous config saved to /var/cache/conftool/dbconfig/20260526-120258-fceratto.json
* 12:01 XioNoX: start ssw1-a1-codfw network maintenance (no impact expected as the spines are redundant)
* 11:59 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] (duration: 15m 26s)
* 11:56 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup2015.codfw.wmnet,db2197.codfw.wmnet with reason: network maintenance
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1005.eqiad.wmnet
* 11:55 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
* 11:54 jynus: stopping mediabackups@codfw for maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 11:54 jmm@dns1004: END - running authdns-update
* 11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92985 and previous config saved to /var/cache/conftool/dbconfig/20260526-115251-fceratto.json
* 11:52 jmm@dns1004: START - running authdns-update
* 11:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1005.eqiad.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1214: Migration of db1214.eqiad.wmnet completed
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1004.eqiad.wmnet
* 11:47 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:46 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1004.eqiad.wmnet
* 11:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]]
* 11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92983 and previous config saved to /var/cache/conftool/dbconfig/20260526-114243-fceratto.json
* 11:42 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1214.eqiad.wmnet with OS trixie
* 11:35 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] (duration: 06m 46s)
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92981 and previous config saved to /var/cache/conftool/dbconfig/20260526-113542-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92980 and previous config saved to /var/cache/conftool/dbconfig/20260526-113521-fceratto.json
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1222: Migration of db1222.eqiad.wmnet completed
* 11:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]]
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92978 and previous config saved to /var/cache/conftool/dbconfig/20260526-112513-fceratto.json
* 11:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92977 and previous config saved to /var/cache/conftool/dbconfig/20260526-112326-marostegui.json
* 11:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2181: Migration of db2181.codfw.wmnet completed
* 11:22 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92975 and previous config saved to /var/cache/conftool/dbconfig/20260526-112215-marostegui.json
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Switchover es2042 es2041 for [[phab:T426199|T426199]]', diff saved to https://phabricator.wikimedia.org/P92974 and previous config saved to /var/cache/conftool/dbconfig/20260526-112028-fceratto.json
* 11:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92972 and previous config saved to /var/cache/conftool/dbconfig/20260526-111506-fceratto.json
* 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2181.codfw.wmnet with OS trixie
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92971 and previous config saved to /var/cache/conftool/dbconfig/20260526-110458-fceratto.json
* 11:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1214.eqiad.wmnet with OS trixie
* 11:00 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] (duration: 15m 50s)
* 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92968 and previous config saved to /var/cache/conftool/dbconfig/20260526-105755-fceratto.json
* 10:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92967 and previous config saved to /var/cache/conftool/dbconfig/20260526-105726-fceratto.json
* 10:56 jiji@deploy1003: jiji: Continuing with deployment
* 10:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:51 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92966 and previous config saved to /var/cache/conftool/dbconfig/20260526-104718-fceratto.json
* 10:46 jiji@deploy1003: jiji: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:44 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]]
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92964 and previous config saved to /var/cache/conftool/dbconfig/20260526-103711-fceratto.json
* 10:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2181.codfw.wmnet with OS trixie
* 10:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:32 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92963 and previous config saved to /var/cache/conftool/dbconfig/20260526-102703-fceratto.json
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1226: Migration of db1226.eqiad.wmnet completed
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92960 and previous config saved to /var/cache/conftool/dbconfig/20260526-101936-fceratto.json
* 10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92959 and previous config saved to /var/cache/conftool/dbconfig/20260526-101842-fceratto.json
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:15 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] (duration: 06m 42s)
* 10:09 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92957 and previous config saved to /var/cache/conftool/dbconfig/20260526-100834-fceratto.json
* 10:06 kharlan@deploy1003: kharlan: Continuing with deployment
* 10:05 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:03 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]]
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2195: Migration of db2195.codfw.wmnet completed
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2004.codfw.wmnet
* 10:01 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2004.codfw.wmnet
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
* 09:58 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92955 and previous config saved to /var/cache/conftool/dbconfig/20260526-095827-fceratto.json
* 09:58 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-eqiad@eqiad
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:55 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2003.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2003.codfw.wmnet
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1006.eqiad.wmnet
* 09:53 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1006.eqiad.wmnet
* 09:52 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-eqiad@eqiad
* 09:52 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] (duration: 08m 07s)
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2044.*
* 09:48 fabfur: repooling cp2043 and cp2044 (haproxy-awslc) ([[phab:T419825|T419825]])
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92953 and previous config saved to /var/cache/conftool/dbconfig/20260526-094819-fceratto.json
* 09:47 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:46 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1006.eqiad.wmnet
* 09:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:44 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]]
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1006.eqiad.wmnet
* 09:41 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1005.eqiad.wmnet
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1005.eqiad.wmnet
* 09:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92951 and previous config saved to /var/cache/conftool/dbconfig/20260526-094115-fceratto.json
* 09:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 09:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92950 and previous config saved to /var/cache/conftool/dbconfig/20260526-094045-fceratto.json
* 09:40 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1226: Migration of db1226.eqiad.wmnet completed
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:38 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:34 fabfur: depooling cp2044 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1005.eqiad.wmnet
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2003.codfw.wmnet
* 09:34 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2044.*
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1005.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1004.eqiad.wmnet
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1004.eqiad.wmnet
* 09:33 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 09:32 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] (duration: 06m 52s)
* 09:32 fabfur: depooling cp2043 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1226.eqiad.wmnet with OS trixie
* 09:30 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 09:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92947 and previous config saved to /var/cache/conftool/dbconfig/20260526-093031-fceratto.json
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2003.codfw.wmnet
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2002.codfw.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2002.codfw.wmnet
* 09:28 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:28 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1003.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1003.eqiad.wmnet
* 09:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]]
* 09:25 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:25 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2001.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2001.codfw.wmnet
* 09:21 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:20 fabfur: start rebooting esams liberica instances ([[phab:T426563|T426563]])
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92946 and previous config saved to /var/cache/conftool/dbconfig/20260526-092024-fceratto.json
* 09:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1003.eqiad.wmnet
* 09:16 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2195: Migration of db2195.codfw.wmnet completed
* 09:15 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1003.eqiad.wmnet
* 09:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 09:14 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] (duration: 06m 47s)
* 09:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92944 and previous config saved to /var/cache/conftool/dbconfig/20260526-091016-fceratto.json
* 09:09 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 09:09 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2195.codfw.wmnet with OS trixie
* 09:07 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]]
* 09:06 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92943 and previous config saved to /var/cache/conftool/dbconfig/20260526-090315-fceratto.json
* 09:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2224.codfw.wmnet with reason: Maintenance
* 09:03 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92942 and previous config saved to /var/cache/conftool/dbconfig/20260526-090256-fceratto.json
* 08:57 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1226.eqiad.wmnet with OS trixie
* 08:53 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 fabfur: start rebooting ulsfo liberica instances ([[phab:T426563|T426563]])
* 08:53 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] (duration: 07m 23s)
* 08:53 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1226: Upgrading db1226.eqiad.wmnet
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92941 and previous config saved to /var/cache/conftool/dbconfig/20260526-085248-fceratto.json
* 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1226: Upgrading db1226.eqiad.wmnet
* 08:50 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Migration of db1222.eqiad.wmnet completed
* 08:48 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 08:47 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:46 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]]
* 08:43 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 08:43 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] (duration: 09m 56s)
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92939 and previous config saved to /var/cache/conftool/dbconfig/20260526-084240-fceratto.json
* 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1222.eqiad.wmnet with OS trixie
* 08:40 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:40 fabfur: start rebooting eqsin liberica instances ([[phab:T426563|T426563]])
* 08:39 kartik@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 08:39 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:39 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1024.eqiad.wmnet
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:35 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:33 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]]
* 08:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92938 and previous config saved to /var/cache/conftool/dbconfig/20260526-083233-fceratto.json
* 08:30 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92937 and previous config saved to /var/cache/conftool/dbconfig/20260526-082531-fceratto.json
* 08:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
* 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92936 and previous config saved to /var/cache/conftool/dbconfig/20260526-082458-fceratto.json
* 08:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2195.codfw.wmnet with OS trixie
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2195: Upgrading db2195.codfw.wmnet
* 08:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2195: Upgrading db2195.codfw.wmnet
* 08:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:18 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92934 and previous config saved to /var/cache/conftool/dbconfig/20260526-081451-fceratto.json
* 08:13 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:12 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:10 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92932 and previous config saved to /var/cache/conftool/dbconfig/20260526-080443-fceratto.json
* 08:01 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1222.eqiad.wmnet with OS trixie
* 08:00 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1024.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1023.eqiad.wmnet
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:59 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:58 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:56 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 07:56 fabfur: start rebooting drmrs liberica instances ([[phab:T426563|T426563]])
* 07:56 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92931 and previous config saved to /var/cache/conftool/dbconfig/20260526-075435-fceratto.json
* 07:52 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1047.eqiad.wmnet
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:49 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1023.eqiad.wmnet
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92930 and previous config saved to /var/cache/conftool/dbconfig/20260526-074739-fceratto.json
* 07:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92929 and previous config saved to /var/cache/conftool/dbconfig/20260526-074710-fceratto.json
* 07:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:45 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:43 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1046.eqiad.wmnet
* 07:38 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] (duration: 12m 01s)
* 07:38 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P92928 and previous config saved to /var/cache/conftool/dbconfig/20260526-073702-fceratto.json
* 07:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:35 fabfur: start rebooting magru liberica instances ([[phab:T426563|T426563]])
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92926 and previous config saved to /var/cache/conftool/dbconfig/20260526-073459-fceratto.json
* 07:32 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
* 07:31 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1046.eqiad.wmnet
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260526-072643-fceratto.json
* 07:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
* 07:26 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]]
* 07:25 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92924 and previous config saved to /var/cache/conftool/dbconfig/20260526-072452-fceratto.json
* 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1047.eqiad.wmnet
* 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1047.eqiad.wmnet
* 07:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92923 and previous config saved to /var/cache/conftool/dbconfig/20260526-071635-fceratto.json
* 07:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
* 07:15 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1026.eqiad.wmnet
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92922 and previous config saved to /var/cache/conftool/dbconfig/20260526-071444-fceratto.json
* 07:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 07:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92921 and previous config saved to /var/cache/conftool/dbconfig/20260526-070946-fceratto.json
* 07:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92920 and previous config saved to /var/cache/conftool/dbconfig/20260526-070916-fceratto.json
* 07:09 moritzm: failover Ganeti master in eqiad to ganeti1048
* 07:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1047.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1046.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92919 and previous config saved to /var/cache/conftool/dbconfig/20260526-070436-fceratto.json
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
* 07:04 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1048.eqiad.wmnet
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92918 and previous config saved to /var/cache/conftool/dbconfig/20260526-065909-fceratto.json
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast2003.wikimedia.org
* 06:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 06:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1048.eqiad.wmnet
* 06:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
* 06:53 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1046.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1045.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 06:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92917 and previous config saved to /var/cache/conftool/dbconfig/20260526-064901-fceratto.json
* 06:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92916 and previous config saved to /var/cache/conftool/dbconfig/20260526-064833-fceratto.json
* 06:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
* 06:47 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1222: Switchover
* 06:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6003.wikimedia.org
* 06:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92914 and previous config saved to /var/cache/conftool/dbconfig/20260526-063853-fceratto.json
* 06:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6003.wikimedia.org
* 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92912 and previous config saved to /var/cache/conftool/dbconfig/20260526-063155-fceratto.json
* 06:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:28 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Switchover
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1222 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92910 and previous config saved to /var/cache/conftool/dbconfig/20260526-061656-fceratto.json
* 06:15 fceratto@dns1005: END - running authdns-update
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1162 to s2 primary and set section read-write [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92909 and previous config saved to /var/cache/conftool/dbconfig/20260526-061114-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s2 eqiad as read-only for maintenance - [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92908 and previous config saved to /var/cache/conftool/dbconfig/20260526-061021-fceratto.json
* 06:10 federico3: Starting s2 eqiad failover from db1222 to db1162 - [[phab:T425622|T425622]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1162 with weight 0 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92907 and previous config saved to /var/cache/conftool/dbconfig/20260526-060443-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T425622|T425622]]
* 06:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:02 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:00 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2024.codfw.wmnet,pc[1014,1024].eqiad.wmnet with reason: Maintenance on pc4
* 04:37 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 04:34 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.1 (duration: 02m 32s)
* 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]] (duration: 36m 24s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 20s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-25 ==
* 21:00 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:49 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 20:38 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1045.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1044.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:15 moritzm: truncate krb5kdc.log1 (which made log rotation fail)
* 20:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 19:57 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1044.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1043.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:22 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_eqiad
* 18:49 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1115.eqiad.wmnet
* 18:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5023.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:22 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:22 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5030.eqsin.wmnet
* 18:15 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5023.eqsin.wmnet
* 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1113.eqiad.wmnet
* 18:03 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:02 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqiad
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-upload_eqsin
* 18:01 sukhe: sre.cdn.roll-reboot cookbooks stalled due to icinga reboot
* 18:00 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqsin
* 17:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1043.eqiad.wmnet
* 17:31 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1110.eqiad.wmnet [reason: manually pooling after reboot as icinga was down]
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1042.eqiad.wmnet
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:29 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1111.eqiad.wmnet
* 17:28 sukhe: sukhe@alert1002:~$ sudo systemctl restart icinga.service
* 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92903 and previous config saved to /var/cache/conftool/dbconfig/20260525-171310-fceratto.json
* 17:11 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92902 and previous config saved to /var/cache/conftool/dbconfig/20260525-170302-fceratto.json
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92901 and previous config saved to /var/cache/conftool/dbconfig/20260525-165255-fceratto.json
* 16:51 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1042.eqiad.wmnet
* 16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92900 and previous config saved to /var/cache/conftool/dbconfig/20260525-164247-fceratto.json
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1041.eqiad.wmnet
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:41 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:40 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5021.eqsin.wmnet
* 16:39 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5029.eqsin.wmnet
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92899 and previous config saved to /var/cache/conftool/dbconfig/20260525-163559-fceratto.json
* 16:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92898 and previous config saved to /var/cache/conftool/dbconfig/20260525-163512-fceratto.json
* 16:34 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1108.eqiad.wmnet
* 16:30 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1109.eqiad.wmnet
* 16:26 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92897 and previous config saved to /var/cache/conftool/dbconfig/20260525-162505-fceratto.json
* 16:20 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1041.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1040.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:16 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92896 and previous config saved to /var/cache/conftool/dbconfig/20260525-161457-fceratto.json
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92895 and previous config saved to /var/cache/conftool/dbconfig/20260525-160450-fceratto.json
* 16:02 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92894 and previous config saved to /var/cache/conftool/dbconfig/20260525-155930-fceratto.json
* 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2249.codfw.wmnet with reason: Maintenance
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5020.eqsin.wmnet
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5028.eqsin.wmnet
* 15:52 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1106.eqiad.wmnet
* 15:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1107.eqiad.wmnet
* 15:29 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1040.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1039.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:17 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1013 from dbctl [[phab:T427190|T427190]]', diff saved to https://phabricator.wikimedia.org/P92893 and previous config saved to /var/cache/conftool/dbconfig/20260525-151718-marostegui.json
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5019.eqsin.wmnet
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5027.eqsin.wmnet
* 15:12 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1104.eqiad.wmnet
* 15:11 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1105.eqiad.wmnet
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92892 and previous config saved to /var/cache/conftool/dbconfig/20260525-150309-fceratto.json
* 14:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92891 and previous config saved to /var/cache/conftool/dbconfig/20260525-145301-fceratto.json
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92890 and previous config saved to /var/cache/conftool/dbconfig/20260525-144253-fceratto.json
* 14:33 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1102.eqiad.wmnet
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92889 and previous config saved to /var/cache/conftool/dbconfig/20260525-143246-fceratto.json
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5026.eqsin.wmnet
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5018.eqsin.wmnet
* 14:31 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1103.eqiad.wmnet
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92888 and previous config saved to /var/cache/conftool/dbconfig/20260525-142551-fceratto.json
* 14:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2228.codfw.wmnet with reason: Maintenance
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92887 and previous config saved to /var/cache/conftool/dbconfig/20260525-142520-fceratto.json
* 14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92885 and previous config saved to /var/cache/conftool/dbconfig/20260525-141513-fceratto.json
* 14:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:06 sukhe: curl localhost:9090/pools/inference-staging-grpc_30051 shows ml-staging200[1-3].codfw.wmnet as enabled and pooled: [[phab:T424049|T424049]]
* 14:05 sukhe: sukhe@lvs2013:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92884 and previous config saved to /var/cache/conftool/dbconfig/20260525-140505-fceratto.json
* 14:03 sukhe: sudo cumin 'A:lvs and A:lvs-low-traffic-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service
* 14:00 sukhe: sudo cumin 'A:lvs and A:lvs-secondary-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 13:59 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1039.eqiad.wmnet
* 13:58 sukhe: sudo cumin 'A:lvs and A:eqiad' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"': NOOP change, since service is codfw only
* 13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92882 and previous config saved to /var/cache/conftool/dbconfig/20260525-135458-fceratto.json
* 13:52 Msz2001: Everything deployed, UTC afternoon config+backport window done
* 13:52 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] (duration: 09m 43s)
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1101.eqiad.wmnet
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1100.eqiad.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5025.eqsin.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:49 kart_: Updated Recommendation API to 2026-05-21-044522-production
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92881 and previous config saved to /var/cache/conftool/dbconfig/20260525-134807-fceratto.json
* 13:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2223.codfw.wmnet with reason: Maintenance
* 13:47 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:47 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92880 and previous config saved to /var/cache/conftool/dbconfig/20260525-134737-fceratto.json
* 13:45 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: Reboot
* 13:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]]
* 13:40 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqiad
* 13:39 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqiad
* 13:38 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] (duration: 08m 14s)
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqsin
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqsin
* 13:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92878 and previous config saved to /var/cache/conftool/dbconfig/20260525-133729-fceratto.json
* 13:34 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:33 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1038.eqiad.wmnet
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:31 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]]
* 13:27 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] (duration: 07m 43s)
* 13:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92876 and previous config saved to /var/cache/conftool/dbconfig/20260525-132722-fceratto.json
* 13:23 mszwarc@deploy1003: mszwarc, jhsoby: Continuing with deployment
* 13:21 mszwarc@deploy1003: mszwarc, jhsoby: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:20 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]]
* 13:19 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] (duration: 15m 53s)
* 13:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92875 and previous config saved to /var/cache/conftool/dbconfig/20260525-131714-fceratto.json
* 13:12 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92873 and previous config saved to /var/cache/conftool/dbconfig/20260525-131023-fceratto.json
* 13:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92872 and previous config saved to /var/cache/conftool/dbconfig/20260525-130950-fceratto.json
* 13:07 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]]
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1162: Reboot
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92870 and previous config saved to /var/cache/conftool/dbconfig/20260525-125942-fceratto.json
* 12:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reboot
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:58 kart_: Updated cxserver to 2026-05-24-103047-production ([[phab:T426808|T426808]], [[phab:T373418|T373418]])
* 12:56 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:56 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:54 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db1162: Reboot
* 12:54 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:54 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:53 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1162.eqiad.wmnet with reason: Reboot
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92868 and previous config saved to /var/cache/conftool/dbconfig/20260525-124934-fceratto.json
* 12:40 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:39 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:39 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1038.eqiad.wmnet
* 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92867 and previous config saved to /var/cache/conftool/dbconfig/20260525-123927-fceratto.json
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92866 and previous config saved to /var/cache/conftool/dbconfig/20260525-123239-fceratto.json
* 12:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92865 and previous config saved to /var/cache/conftool/dbconfig/20260525-123208-fceratto.json
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92864 and previous config saved to /var/cache/conftool/dbconfig/20260525-122201-fceratto.json
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1037.eqiad.wmnet
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92863 and previous config saved to /var/cache/conftool/dbconfig/20260525-121153-fceratto.json
* 12:10 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92862 and previous config saved to /var/cache/conftool/dbconfig/20260525-120145-fceratto.json
* 11:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92861 and previous config saved to /var/cache/conftool/dbconfig/20260525-115504-fceratto.json
* 11:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92860 and previous config saved to /var/cache/conftool/dbconfig/20260525-115434-fceratto.json
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92859 and previous config saved to /var/cache/conftool/dbconfig/20260525-114426-fceratto.json
* 11:43 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1037.eqiad.wmnet
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92858 and previous config saved to /var/cache/conftool/dbconfig/20260525-113419-fceratto.json
* 11:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2160.codfw.wmnet with OS trixie
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92857 and previous config saved to /var/cache/conftool/dbconfig/20260525-112411-fceratto.json
* 11:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92856 and previous config saved to /var/cache/conftool/dbconfig/20260525-111717-fceratto.json
* 11:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92855 and previous config saved to /var/cache/conftool/dbconfig/20260525-111648-fceratto.json
* 11:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92854 and previous config saved to /var/cache/conftool/dbconfig/20260525-110640-fceratto.json
* 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 10:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:56 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92853 and previous config saved to /var/cache/conftool/dbconfig/20260525-105633-fceratto.json
* 10:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92852 and previous config saved to /var/cache/conftool/dbconfig/20260525-104625-fceratto.json
* 10:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2160.codfw.wmnet with OS trixie
* 10:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92851 and previous config saved to /var/cache/conftool/dbconfig/20260525-104141-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to pc3 as master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92850 and previous config saved to /var/cache/conftool/dbconfig/20260525-104055-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to dbctl', diff saved to https://phabricator.wikimedia.org/P92849 and previous config saved to /var/cache/conftool/dbconfig/20260525-104027-marostegui.json
* 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92848 and previous config saved to /var/cache/conftool/dbconfig/20260525-103944-fceratto.json
* 10:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 10:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
* 10:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
* 10:27 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:16 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1006.eqiad.wmnet
* 09:57 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.eqiad.wmnet
* 09:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92847 and previous config saved to /var/cache/conftool/dbconfig/20260525-091302-fceratto.json
* 09:12 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92846 and previous config saved to /var/cache/conftool/dbconfig/20260525-090255-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92845 and previous config saved to /var/cache/conftool/dbconfig/20260525-085247-fceratto.json
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92844 and previous config saved to /var/cache/conftool/dbconfig/20260525-084239-fceratto.json
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92843 and previous config saved to /var/cache/conftool/dbconfig/20260525-083540-fceratto.json
* 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2231.codfw.wmnet with reason: Maintenance
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92842 and previous config saved to /var/cache/conftool/dbconfig/20260525-083511-fceratto.json
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92841 and previous config saved to /var/cache/conftool/dbconfig/20260525-082504-fceratto.json
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92840 and previous config saved to /var/cache/conftool/dbconfig/20260525-081456-fceratto.json
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92839 and previous config saved to /var/cache/conftool/dbconfig/20260525-080448-fceratto.json
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92838 and previous config saved to /var/cache/conftool/dbconfig/20260525-075739-fceratto.json
* 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: Maintenance
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92837 and previous config saved to /var/cache/conftool/dbconfig/20260525-075708-fceratto.json
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92836 and previous config saved to /var/cache/conftool/dbconfig/20260525-074700-fceratto.json
* 07:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92835 and previous config saved to /var/cache/conftool/dbconfig/20260525-073653-fceratto.json
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92834 and previous config saved to /var/cache/conftool/dbconfig/20260525-072645-fceratto.json
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92833 and previous config saved to /var/cache/conftool/dbconfig/20260525-071953-fceratto.json
* 07:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2196.codfw.wmnet with reason: Maintenance
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92832 and previous config saved to /var/cache/conftool/dbconfig/20260525-071924-fceratto.json
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92831 and previous config saved to /var/cache/conftool/dbconfig/20260525-070917-fceratto.json
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2233.codfw.wmnet with OS trixie
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92830 and previous config saved to /var/cache/conftool/dbconfig/20260525-065909-fceratto.json
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92829 and previous config saved to /var/cache/conftool/dbconfig/20260525-064902-fceratto.json
* 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92828 and previous config saved to /var/cache/conftool/dbconfig/20260525-064305-fceratto.json
* 06:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2233.codfw.wmnet with OS trixie
* 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2233.codfw.wmnet with reason: Reimage to Trixie
* 06:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:17 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboot upgrade m2
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2233.codfw.wmnet with reason: Reboot upgrade m2
* 06:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1027.eqiad.wmnet with reason: Reboot
* 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2023.codfw.wmnet,pc[1013,1023].eqiad.wmnet with reason: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1013.eqiad.wmnet: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1013.eqiad.wmnet: Maintenance on pc3
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 43s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-24 ==
* 19:08 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-23 ==
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-22 ==
* 23:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 23:38 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 22:20 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:29 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:28 inflatador: bking@deploy1003 set eqiad prod cirrus `node_concurrent_recoveries` up to 7 from 4 [[phab:T426585|T426585]]
* 20:27 inflatador: bking@deploy1003 set codfw prod cirrus `node_concurrent_recoveries` back down to 4 from 7 [[phab:T426585|T426585]]
* 18:39 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 17:34 topranks: enable ttl protection on esams CRs IBGP session
* 17:28 topranks: enable ttl protection on ulsfo CRs IBGP session
* 16:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:49 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:16 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:58 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:15 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:14 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2008-dev.codfw.wmnet
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb[1020,1022-1025].eqiad.wmnet
* 14:29 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:23 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2008-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2007-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:03 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 13:59 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb[1020,1022-1025].eqiad.wmnet
* 13:58 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 13:53 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2007-dev.codfw.wmnet
* 13:52 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1018.eqiad.wmnet
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:46 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for 6 hosts
* 13:16 inflatador: bking@deploy1002 set search_codfw cluster recovery settings from 4 to 7 [[phab:T426560|T426560]]
* 13:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for 6 hosts
* 13:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 13:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 13:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:10 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet
* 13:09 elukey: uploaded spicerack_12.6.0 to apt.wikimedia.org bookworm-wikimedia
* 13:08 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1017.eqiad.wmnet
* 12:59 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3081.esams.wmnet
* 12:54 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:41 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:15 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3080.esams.wmnet
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 12:03 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3073.esams.wmnet
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: Migration of db2154.codfw.wmnet completed
* 11:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3072.esams.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 11:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1172: Migration of db1172.eqiad.wmnet completed
* 11:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
* 11:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3079.esams.wmnet
* 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
* 10:55 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:48 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet
* 10:43 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2154: Migration of db2154.codfw.wmnet completed
* 10:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
* 10:37 moritzm: remove ganeti1024 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2154.codfw.wmnet with OS trixie
* 10:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2010.codfw.wmnet with OS trixie
* 10:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1172: Migration of db1172.eqiad.wmnet completed
* 10:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3078.esams.wmnet
* 10:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS trixie
* 10:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1017.eqiad.wmnet
* 10:13 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3071.esams.wmnet
* 09:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:56 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS trixie
* 09:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:51 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: Upgrading db2154.codfw.wmnet
* 09:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2154: Upgrading db2154.codfw.wmnet
* 09:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS trixie
* 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS trixie
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3070.esams.wmnet
* 09:21 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:16 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 09:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3077.esams.wmnet
* 09:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:03 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 08:47 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 08:40 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:30 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3076.esams.wmnet
* 08:18 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:15 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:07 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:07 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3069.esams.wmnet
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 07:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 07:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3068.esams.wmnet
* 07:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:10 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3075.esams.wmnet
* 07:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1057
* 07:01 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1057
* 06:58 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3067.esams.wmnet
* 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
* 06:46 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 06:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3007.wikimedia.org
* 06:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3007.wikimedia.org
* 05:25 marostegui@dns1004: END - running authdns-update
* 05:24 marostegui@dns1004: START - running authdns-update
* 05:23 marostegui: Failover m5-master [[phab:T426633|T426633]]
* 05:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1028.eqiad.wmnet with reason: Reboot
* 05:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy2005.codfw.wmnet with reason: Reboot
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1012.eqiad.wmnet
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:06 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:03 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 04:56 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1012.eqiad.wmnet
== 2026-05-21 ==
* 23:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] (duration: 06m 42s)
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified
* 23:36 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]]
* 22:26 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2002.codfw.wmnet with OS trixie
* 22:08 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:03 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:02 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:44 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2002.codfw.wmnet with OS trixie
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:20 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:19 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:26 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:16 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 19:22 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase
* 19:10 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:59 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:53 papaul: rebooting msw1-codfw
* 18:50 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:39 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:50 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:48 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:43 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:39 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 17:39 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:38 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 17:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 17:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:36 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:25 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:25 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:23 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:22 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1016.eqiad.wmnet
* 17:22 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:13 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1016.eqiad.wmnet
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92810 and previous config saved to /var/cache/conftool/dbconfig/20260521-170823-ladsgroup.json
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 16:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 16:55 papaul: rebooting msw-d3-codfw
* 16:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:52 papaul: rebooting msw-c7-codfw
* 16:51 papaul: rebooting msw-c6-codfw
* 16:48 papaul: rebooting msw-b7-codfw
* 16:48 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet
* 16:45 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1014.eqiad.wmnet
* 16:43 papaul: rebooting msw-b6-codfw
* 16:40 papaul: rebooting msw-a1-codfw
* 16:37 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:37 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1014.eqiad.wmnet
* 16:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:33 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:26 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc1022.eqiad.wmnet with reason: Move to nftables
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc2022.codfw.wmnet with reason: Move to nftables
* 16:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Repooling
* 16:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92807 and previous config saved to /var/cache/conftool/dbconfig/20260521-161808-ladsgroup.json
* 16:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:52 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Repooling
* 15:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92804 and previous config saved to /var/cache/conftool/dbconfig/20260521-154108-fceratto.json
* 15:39 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92803 and previous config saved to /var/cache/conftool/dbconfig/20260521-153400-fceratto.json
* 15:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2048.codfw.wmnet with reason: Maintenance
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92802 and previous config saved to /var/cache/conftool/dbconfig/20260521-153331-fceratto.json
* 15:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:25 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92801 and previous config saved to /var/cache/conftool/dbconfig/20260521-152323-fceratto.json
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet
* 15:19 claime: Enabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 15:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet
* 15:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92800 and previous config saved to /var/cache/conftool/dbconfig/20260521-151316-fceratto.json
* 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 15:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet
* 15:07 elukey@cumin1003: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:05 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:04 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] (duration: 10m 11s)
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92799 and previous config saved to /var/cache/conftool/dbconfig/20260521-150308-fceratto.json
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet
* 15:00 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master
* 15:00 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:00 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.pki.restart-reboot (exit_code=0) rolling reboot on A:pki
* 14:57 claime: Disabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 14:56 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:55 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 14:54 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-build1001.eqiad.wmnet
* 14:54 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]]
* 14:54 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 14:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1001.eqiad.wmnet
* 14:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1001.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92798 and previous config saved to /var/cache/conftool/dbconfig/20260521-145132-fceratto.json
* 14:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2040.codfw.wmnet with reason: Maintenance
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92797 and previous config saved to /var/cache/conftool/dbconfig/20260521-145103-fceratto.json
* 14:50 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-build1001.eqiad.wmnet
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Migration of db2241.codfw.wmnet completed
* 14:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1001.eqiad.wmnet
* 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
* 14:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1001.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-eqiad
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1011.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1011.eqiad.wmnet
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92795 and previous config saved to /var/cache/conftool/dbconfig/20260521-144055-fceratto.json
* 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:37 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1011.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1011.eqiad.wmnet
* 14:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 14:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1010.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1010.eqiad.wmnet
* 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92793 and previous config saved to /var/cache/conftool/dbconfig/20260521-143045-fceratto.json
* 14:30 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:30 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:29 elukey@cumin1003: START - Cookbook sre.pki.restart-reboot rolling reboot on A:pki
* 14:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
* 14:27 slyngshede@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-reboot (exit_code=1) rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 14:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet
* 14:24 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1010.eqiad.wmnet
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92792 and previous config saved to /var/cache/conftool/dbconfig/20260521-142037-fceratto.json
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:17 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1010.eqiad.wmnet
* 14:14 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1009.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1009.eqiad.wmnet
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: repool after maintenance
* 14:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92789 and previous config saved to /var/cache/conftool/dbconfig/20260521-140906-fceratto.json
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
* 14:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92788 and previous config saved to /var/cache/conftool/dbconfig/20260521-140837-fceratto.json
* 14:08 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1009.eqiad.wmnet
* 14:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet
* 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
* 14:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Migration of db2241.codfw.wmnet completed
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1009.eqiad.wmnet
* 14:03 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1008.eqiad.wmnet
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1008.eqiad.wmnet
* 14:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2241.codfw.wmnet with OS trixie
* 13:59 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
* 13:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92786 and previous config saved to /var/cache/conftool/dbconfig/20260521-135830-fceratto.json
* 13:58 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1007.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1007.eqiad.wmnet
* 13:51 Lucas_WMDE: UTC afternoon backport+config window done
* 13:51 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] (duration: 07m 20s)
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92784 and previous config saved to /var/cache/conftool/dbconfig/20260521-134822-fceratto.json
* 13:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1007.eqiad.wmnet
* 13:47 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 13:46 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
* 13:45 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes
* 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:44 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 13:43 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]]
* 13:43 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 13:43 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1007.eqiad.wmnet
* 13:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1006.eqiad.wmnet
* 13:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1006.eqiad.wmnet
* 13:41 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] (duration: 06m 52s)
* 13:41 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 13:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet
* 13:38 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
* 13:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92782 and previous config saved to /var/cache/conftool/dbconfig/20260521-133815-fceratto.json
* 13:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1006.eqiad.wmnet
* 13:37 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
* 13:37 dbrant@deploy1003: dbrant: Continuing with deployment
* 13:36 dbrant@deploy1003: dbrant: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
* 13:35 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]]
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1006.eqiad.wmnet
* 13:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1005.eqiad.wmnet
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1005.eqiad.wmnet
* 13:31 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] (duration: 09m 11s)
* 13:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92781 and previous config saved to /var/cache/conftool/dbconfig/20260521-133116-fceratto.json
* 13:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92780 and previous config saved to /var/cache/conftool/dbconfig/20260521-133048-fceratto.json
* 13:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
* 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet
* 13:27 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1005.eqiad.wmnet
* 13:27 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: repool after maintenance
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2241.codfw.wmnet with OS trixie
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:24 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:22 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]]
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1005.eqiad.wmnet
* 13:22 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1004.eqiad.wmnet
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1004.eqiad.wmnet
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92778 and previous config saved to /var/cache/conftool/dbconfig/20260521-132041-fceratto.json
* 13:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
* 13:20 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] (duration: 11m 55s)
* 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 13:17 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet
* 13:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1039: Repooling
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
* 13:15 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1004.eqiad.wmnet
* 13:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 13:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1004.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92776 and previous config saved to /var/cache/conftool/dbconfig/20260521-131033-fceratto.json
* 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1003.eqiad.wmnet
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1003.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 13:10 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2241 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92775 and previous config saved to /var/cache/conftool/dbconfig/20260521-131025-cwilliams.json
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:10 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 13:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3074.esams.wmnet
* 13:06 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2162 to x3 primary [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92774 and previous config saved to /var/cache/conftool/dbconfig/20260521-130609-cwilliams.json
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:04 cezmunsta: Starting x3 codfw failover from db2241 to db2162 - [[phab:T426936|T426936]]
* 13:04 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1003.eqiad.wmnet
* 13:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 13:00 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92772 and previous config saved to /var/cache/conftool/dbconfig/20260521-130018-fceratto.json
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1003.eqiad.wmnet
* 12:59 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:59 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1002.eqiad.wmnet
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1002.eqiad.wmnet
* 12:58 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:57 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2162 with weight 0 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92771 and previous config saved to /var/cache/conftool/dbconfig/20260521-125645-cwilliams.json
* 12:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T426936|T426936]]
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet
* 12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
* 12:54 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1002.eqiad.wmnet
* 12:54 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6008.drmrs.wmnet
* 12:53 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:52 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1002.eqiad.wmnet
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
* 12:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:48 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3066.esams.wmnet
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92770 and previous config saved to /var/cache/conftool/dbconfig/20260521-124707-fceratto.json
* 12:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
* 12:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1039: Repooling
* 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet
* 12:45 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:43 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] (duration: 07m 54s)
* 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92768 and previous config saved to /var/cache/conftool/dbconfig/20260521-124014-fceratto.json
* 12:39 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
* 12:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 12:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:36 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:36 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]]
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:34 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:34 kart_: Updated cxserver to 2026-05-20-034002-production ([[phab:T388690|T388690]], [[phab:T404295|T404295]], [[phab:T391703|T391703]], [[phab:T426605|T426605]])
* 12:34 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
* 12:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
* 12:30 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:30 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
* 12:29 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92767 and previous config saved to /var/cache/conftool/dbconfig/20260521-122905-fceratto.json
* 12:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
* 12:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92766 and previous config saved to /var/cache/conftool/dbconfig/20260521-122839-fceratto.json
* 12:27 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:27 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:26 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-staging-worker
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2003.codfw.wmnet
* 12:23 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2003.codfw.wmnet
* 12:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
* 12:21 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:21 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:21 moritzm: installing nginx security updates
* 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 12:20 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92765 and previous config saved to /var/cache/conftool/dbconfig/20260521-121832-fceratto.json
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2003.codfw.wmnet
* 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
* 12:15 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
* 12:13 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6007.drmrs.wmnet
* 12:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92764 and previous config saved to /var/cache/conftool/dbconfig/20260521-120824-fceratto.json
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2003.codfw.wmnet
* 12:07 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2002.codfw.wmnet
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2002.codfw.wmnet
* 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
* 12:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6014.drmrs.wmnet
* 12:00 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:00 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2002.codfw.wmnet
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
* 11:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92763 and previous config saved to /var/cache/conftool/dbconfig/20260521-115817-fceratto.json
* 11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
* 11:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
* 11:51 taavi: disabling puppet on C:bird to roll out {{Gerrit|1289919}}
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92762 and previous config saved to /var/cache/conftool/dbconfig/20260521-115112-fceratto.json
* 11:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2047.codfw.wmnet with reason: Maintenance
* 11:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2002.codfw.wmnet
* 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92761 and previous config saved to /var/cache/conftool/dbconfig/20260521-115043-fceratto.json
* 11:50 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2001.codfw.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2001.codfw.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt2002.wikimedia.org
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet
* 11:45 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2001.codfw.wmnet
* 11:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp1001.eqiad.wmnet
* 11:44 kartik@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet
* 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt2002.wikimedia.org
* 11:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet
* 11:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92760 and previous config saved to /var/cache/conftool/dbconfig/20260521-114036-fceratto.json
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp1001.eqiad.wmnet
* 11:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
* 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 11:36 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet
* 11:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2001.codfw.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker
* 11:35 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
* 11:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
* 11:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
* 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92759 and previous config saved to /var/cache/conftool/dbconfig/20260521-113028-fceratto.json
* 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2014.codfw.wmnet
* 11:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
* 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
* 11:26 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet
* 11:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2014.codfw.wmnet
* 11:20 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6013.drmrs.wmnet
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92758 and previous config saved to /var/cache/conftool/dbconfig/20260521-112021-fceratto.json
* 11:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
* 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2013.codfw.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
* 11:09 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92757 and previous config saved to /var/cache/conftool/dbconfig/20260521-110851-fceratto.json
* 11:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2037.codfw.wmnet with reason: Maintenance
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92756 and previous config saved to /var/cache/conftool/dbconfig/20260521-110822-fceratto.json
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
* 11:05 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2013.codfw.wmnet
* 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:04 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6006.drmrs.wmnet
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
* 11:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
* 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92753 and previous config saved to /var/cache/conftool/dbconfig/20260521-105815-fceratto.json
* 10:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet
* 10:55 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:54 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2012.codfw.wmnet
* 10:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92752 and previous config saved to /var/cache/conftool/dbconfig/20260521-104807-fceratto.json
* 10:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2012.codfw.wmnet
* 10:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet
* 10:44 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] (duration: 08m 02s)
* 10:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:40 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet
* 10:40 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:39 jiji@deploy1003: jiji: Continuing with deployment
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92751 and previous config saved to /var/cache/conftool/dbconfig/20260521-103759-fceratto.json
* 10:37 jiji@deploy1003: jiji: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:36 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]]
* 10:35 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet
* 10:34 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:29 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet
* 10:27 dcausse: [[phab:T423993|T423993]]: reindexing all archive indices
* 10:27 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92749 and previous config saved to /var/cache/conftool/dbconfig/20260521-102630-fceratto.json
* 10:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2036.codfw.wmnet with reason: Maintenance
* 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92748 and previous config saved to /var/cache/conftool/dbconfig/20260521-102601-fceratto.json
* 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2011.codfw.wmnet
* 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6005.drmrs.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet
* 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2011.codfw.wmnet
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet
* 10:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92747 and previous config saved to /var/cache/conftool/dbconfig/20260521-101552-fceratto.json
* 10:15 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:14 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet
* 10:12 moritzm: installing postgresql security updates
* 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet
* 10:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet
* 10:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon1003.wikimedia.org
* 10:09 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:08 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet
* 10:08 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet
* 10:07 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet
* 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet
* 10:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92746 and previous config saved to /var/cache/conftool/dbconfig/20260521-100545-fceratto.json
* 10:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet
* 10:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet
* 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet
* 10:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet
* 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet
* 10:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org
* 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org
* 09:59 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1005.eqiad.wmnet
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-codfw
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2005.codfw.wmnet
* 09:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2005.codfw.wmnet
* 09:56 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
* 09:56 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:56 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92745 and previous config saved to /var/cache/conftool/dbconfig/20260521-095536-fceratto.json
* 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1384.eqiad.wmnet
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2004.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2004.codfw.wmnet
* 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
* 09:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
* 09:49 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1384.eqiad.wmnet
* 09:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1383.eqiad.wmnet
* 09:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92744 and previous config saved to /var/cache/conftool/dbconfig/20260521-094829-fceratto.json
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92743 and previous config saved to /var/cache/conftool/dbconfig/20260521-094801-fceratto.json
* 09:47 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet
* 09:47 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 [[phab:T426563|T426563]]
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-eqiad
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:44 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1383.eqiad.wmnet
* 09:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1382.eqiad.wmnet
* 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
* 09:39 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet
* 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1382.eqiad.wmnet
* 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1381.eqiad.wmnet
* 09:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2002.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2002.codfw.wmnet
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92742 and previous config saved to /var/cache/conftool/dbconfig/20260521-093754-fceratto.json
* 09:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet
* 09:36 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 09:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6012.drmrs.wmnet
* 09:34 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet
* 09:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1381.eqiad.wmnet
* 09:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1380.eqiad.wmnet
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2001.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2001.codfw.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet
* 09:29 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=eqiad
* 09:29 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts.*,name=codfw
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet
* 09:28 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92741 and previous config saved to /var/cache/conftool/dbconfig/20260521-092746-fceratto.json
* 09:27 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1380.eqiad.wmnet
* 09:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1379.eqiad.wmnet
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet
* 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
* 09:25 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet
* 09:24 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=codfw
* 09:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:23 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-eqiad
* 09:22 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1379.eqiad.wmnet
* 09:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1378.eqiad.wmnet
* 09:21 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-codfw
* 09:21 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:18 moritzm: remove ganeti1023 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92740 and previous config saved to /var/cache/conftool/dbconfig/20260521-091738-fceratto.json
* 09:16 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1378.eqiad.wmnet
* 09:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1376.eqiad.wmnet
* 09:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Repooling
* 09:07 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1376.eqiad.wmnet
* 09:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1375.eqiad.wmnet
* 09:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92738 and previous config saved to /var/cache/conftool/dbconfig/20260521-090609-fceratto.json
* 09:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1037.eqiad.wmnet with reason: Maintenance
* 09:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1375.eqiad.wmnet
* 09:01 btullis@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:55 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6011.drmrs.wmnet
* 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:44 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6004.drmrs.wmnet
* 08:37 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Repooling
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92733 and previous config saved to /var/cache/conftool/dbconfig/20260521-082951-fceratto.json
* 08:29 hashar@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92731 and previous config saved to /var/cache/conftool/dbconfig/20260521-081642-fceratto.json
* 08:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
* 08:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6003.drmrs.wmnet
* 08:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1256.eqiad.wmnet with OS trixie
* 07:52 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:51 marostegui@dns1004: END - running authdns-update
* 07:50 marostegui@dns1004: START - running authdns-update
* 07:48 marostegui: Failover m3-master [[phab:T426633|T426633]]
* 07:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet
* 07:46 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6010.drmrs.wmnet
* 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:38 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6002.drmrs.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:24 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:24 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1256.eqiad.wmnet with OS trixie
* 07:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy1025.eqiad.wmnet with reason: Rebooting
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:52 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
* 06:40 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 06:39 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
* 06:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
* 06:24 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:23 arnaudb@cumin1003: END (FAIL) - Cookbook sre.gerrit.reboot-gerrit (exit_code=99) Rebooting Gerrit on gerrit2003
* 06:22 arnaudb@cumin1003: START - Cookbook sre.gerrit.reboot-gerrit Rebooting Gerrit on gerrit2003
* 06:15 marostegui@dns1004: END - running authdns-update
* 06:14 marostegui: Failover m2-master [[phab:T426633|T426633]]
* 06:13 marostegui@dns1004: START - running authdns-update
* 05:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1012 from dbctl [[phab:T426930|T426930]]', diff saved to https://phabricator.wikimedia.org/P92728 and previous config saved to /var/cache/conftool/dbconfig/20260521-053858-marostegui.json
* 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92727 and previous config saved to /var/cache/conftool/dbconfig/20260521-053000-marostegui.json
* 05:29 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1022 to pc2 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92726 and previous config saved to /var/cache/conftool/dbconfig/20260521-052905-marostegui.json
* 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc1012.eqiad.wmnet with reason: Cloning
* 02:41 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on planet1003.eqiad.wmnet with reason: debug wip
* 02:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1027.eqiad.wmnet
* 01:22 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1027.eqiad.wmnet
* 00:55 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2428865
2428864
2026-06-21T02:01:36Z
Stashbot
7414
mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 01m 12s)
== 2026-06-21 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 01m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-20 ==
* 13:32 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:31 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:31 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-19 ==
* 19:21 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303006{{!}}Disable ShortUrl on remaining wikis (T107188)]] (duration: 80m 14s)
* 19:17 krinkle@deploy1003: krinkle: Continuing with deployment
* 18:03 krinkle@deploy1003: krinkle: Backport for [[gerrit:1303006{{!}}Disable ShortUrl on remaining wikis (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:01 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1303006{{!}}Disable ShortUrl on remaining wikis (T107188)]]
* 16:22 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 16:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1023.eqiad.wmnet
* 16:08 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 16:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1023.eqiad.wmnet
* 16:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1022.eqiad.wmnet
* 15:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1022.eqiad.wmnet
* 15:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1021.eqiad.wmnet
* 15:45 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1021.eqiad.wmnet
* 15:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet
* 15:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1020.eqiad.wmnet
* 15:34 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 15:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2004.codfw.wmnet
* 15:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2004.codfw.wmnet
* 15:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2003.codfw.wmnet
* 15:17 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2003.codfw.wmnet
* 15:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2002.codfw.wmnet
* 15:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2002.codfw.wmnet
* 15:11 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet
* 14:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2009.codfw.wmnet with OS trixie
* 13:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
* 13:41 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2009.codfw.wmnet with reason: host reimage
* 13:28 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
* 13:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs2001.codfw.wmnet
* 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs1003.eqiad.wmnet
* 12:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs1003.eqiad.wmnet
* 12:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs1002.eqiad.wmnet
* 12:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs1002.eqiad.wmnet
* 12:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs1001.eqiad.wmnet
* 12:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs1001.eqiad.wmnet
* 12:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1022.eqiad.wmnet
* 12:32 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1022.eqiad.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2235.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2234.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2234.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2232.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2232.codfw.wmnet
* 12:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 12:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 12:10 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 12:08 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on phab2002.codfw.wmnet with reason: Host Replacement
* 12:05 urbanecm@deploy1003: mwscript-k8s job started: GrowthExperiments:migrateMentorStatusAway.php --wiki=viwiki # [[phab:T409170|T409170]]
* 12:04 urbanecm@deploy1003: mwscript-k8s job started: GrowthExperiments:MigrateMentorStatusAway --wiki=viwiki # [[phab:T409170|T409170]]
* 11:33 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:38 moritzm: imported nodejs 24.17.0-1nodesource1 to thirdparty/node24 for trixie-wikimedia
* 10:37 moritzm: imported nodejs 22.23.0-1nodesource1 to thirdparty/node22 for trixie-wikimedia
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 10:33 btullis@puppetserver1001: conftool action : set/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 10:29 sergi0: Run `MigrateMentorStatusAway` script for all wikis in growthexperiments dblist - [[phab:T409170|T409170]]
* 10:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet
* 10:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1020.eqiad.wmnet
* 10:04 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1024
* 10:03 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1024
* 10:03 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1023
* 10:03 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1023
* 10:03 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1021
* 10:03 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1021
* 10:00 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1024
* 09:59 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1024
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1022
* 09:57 btullis@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1022
* 09:56 cmooney@cumin1003: END (PASS) - Cookbook sre.network.host-bgp (exit_code=0) for host dse-k8s-worker1020
* 09:54 cmooney@cumin1003: START - Cookbook sre.network.host-bgp for host dse-k8s-worker1020
* 09:43 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet
* 09:36 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1020.eqiad.wmnet
* 07:32 slyngs: Update IDP/SSO to CAS v7.3.7.3
* 07:31 slyngshede@dns1004: END - running authdns-update
* 07:30 slyngshede@dns1004: START - running authdns-update
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 49s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:19 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: sync
* 01:18 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: sync
* 01:18 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: sync
* 01:17 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics: sync
* 01:17 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: sync
* 01:17 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics: sync
* 01:06 ottomata: roll restart eventgate-analytics to pick up stream config change - [[phab:T427787|T427787]]
== 2026-06-18 ==
* 23:46 Amir1: ALTER TABLE reading_list_project AUTO_INCREMENT = 882; on wikishared on x1 master ([[phab:T428002|T428002]])
* 23:34 rzl@deploy1003: Finished deploy [docker-pkg/deploy@f030aed]: (no justification provided) (duration: 00m 45s)
* 23:33 rzl@deploy1003: Started deploy [docker-pkg/deploy@f030aed]: (no justification provided)
* 23:28 rzl@deploy1003: Finished deploy [docker-pkg/deploy@f030aed]: (no justification provided) (duration: 00m 26s)
* 23:27 rzl@deploy1003: Started deploy [docker-pkg/deploy@f030aed]: (no justification provided)
* 23:03 rzl: rzl@apt1002:~$ sudo -i reprepro -C main include trixie-wikimedia /home/rzl/httpbb/trixie/httpbb_0.0.5-1+deb13u1_amd64.changes # [[phab:T427899|T427899]]
* 22:52 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304195{{!}}hCaptcha: Re-enable for mcrundo (T427612)]] (duration: 07m 25s)
* 22:47 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 22:46 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1304195{{!}}hCaptcha: Re-enable for mcrundo (T427612)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1304195{{!}}hCaptcha: Re-enable for mcrundo (T427612)]]
* 21:29 maryum: Deployed security fix for [[phab:T428833|T428833]]
* 21:14 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303493{{!}}Prevent surveys being automatically added to non-Wikipedias (T393436)]] (duration: 07m 54s)
* 21:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 21:10 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 21:09 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 21:08 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1303493{{!}}Prevent surveys being automatically added to non-Wikipedias (T393436)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:06 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1303493{{!}}Prevent surveys being automatically added to non-Wikipedias (T393436)]]
* 20:12 dani@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303895{{!}}Deploy English Wikipedia Mobile App Survey (T428876)]] (duration: 08m 20s)
* 20:08 dani@deploy1003: dani: Continuing with deployment
* 20:06 dani@deploy1003: dani: Backport for [[gerrit:1303895{{!}}Deploy English Wikipedia Mobile App Survey (T428876)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 dani@deploy1003: Started scap sync-world: Backport for [[gerrit:1303895{{!}}Deploy English Wikipedia Mobile App Survey (T428876)]]
* 19:11 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=dns7002.*
* 19:09 cdobbins@dns1004: END - running authdns-update
* 19:08 cdobbins@dns1004: START - running authdns-update
* 19:07 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=dns7002.*,service=authdns-update
* 19:05 cdobbins@dns1004: END - running authdns-update
* 19:04 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2002.codfw.wmnet with reason: Host Replacement
* 19:03 cdobbins@dns1004: START - running authdns-update
* 19:01 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dns7002.wikimedia.org
* 19:01 cdobbins@cumin2002: START - Cookbook sre.hosts.remove-downtime for dns7002.wikimedia.org
* 18:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns7002.wikimedia.org with OS bookworm
* 18:39 jhuneidi@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.7 refs [[phab:T423916|T423916]]
* 18:37 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 18:34 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 18:33 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 18:31 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 18:29 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:28 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:27 swfrench-wmf: (eqiad) kubectl delete pod coredns-54cdd9bdf-6hwb5 -n kube-system - [[phab:T429156|T429156]]
* 18:27 swfrench-wmf: (eqiad) kubectl delete pod coredns-54cdd9bdf-6n4ps -n kube-system - [[phab:T429156|T429156]]
* 18:26 jhuneidi@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304067{{!}}SpecialSpecialPages: Guard against special pages with no content-language alias (T429584)]] (duration: 08m 46s)
* 18:21 jhuneidi@deploy1003: jhuneidi, jforrester: Continuing with deployment
* 18:19 jhuneidi@deploy1003: jhuneidi, jforrester: Backport for [[gerrit:1304067{{!}}SpecialSpecialPages: Guard against special pages with no content-language alias (T429584)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:17 jhuneidi@deploy1003: Started scap sync-world: Backport for [[gerrit:1304067{{!}}SpecialSpecialPages: Guard against special pages with no content-language alias (T429584)]]
* 18:09 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 18:04 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 17:37 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host dns7002.wikimedia.org with OS bookworm
* 16:28 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304112{{!}}Add script to fix fr_archive_name drifts (T428406)]] (duration: 06m 46s)
* 16:24 zabe@deploy1003: zabe: Continuing with deployment
* 16:24 zabe@deploy1003: zabe: Backport for [[gerrit:1304112{{!}}Add script to fix fr_archive_name drifts (T428406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:22 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1304112{{!}}Add script to fix fr_archive_name drifts (T428406)]]
* 15:55 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303981{{!}}LocalFileMoveBatch: Also update fr_archive_name when moving file (T428406)]] (duration: 06m 49s)
* 15:51 zabe@deploy1003: zabe: Continuing with deployment
* 15:51 zabe@deploy1003: zabe: Backport for [[gerrit:1303981{{!}}LocalFileMoveBatch: Also update fr_archive_name when moving file (T428406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1303981{{!}}LocalFileMoveBatch: Also update fr_archive_name when moving file (T428406)]]
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:08 elukey@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 15:08 elukey@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 15:04 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304082{{!}}Check that data-parsoid is an array before accessing it as such (T429582)]] (duration: 11m 17s)
* 15:00 cscott@deploy1003: ihurbain, cscott: Continuing with deployment
* 14:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:57 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:55 cscott@deploy1003: ihurbain, cscott: Backport for [[gerrit:1304082{{!}}Check that data-parsoid is an array before accessing it as such (T429582)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:53 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1304082{{!}}Check that data-parsoid is an array before accessing it as such (T429582)]]
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:51 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:51 ayounsi@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:46 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2003.codfw.wmnet with reason: trixie homer deploy - ayounsi@cumin1003
* 14:42 moritzm: installing zsh updates from Bookworm point release
* 14:37 brouberol@dns1004: END - running authdns-update
* 14:35 brouberol@dns1004: START - running authdns-update
* 14:27 jgreen@dns1004: END - running authdns-update
* 14:25 jgreen@dns1004: START - running authdns-update
* 14:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy2007.codfw.wmnet
* 14:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy2007.codfw.wmnet
* 14:21 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy2008.codfw.wmnet
* 14:21 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy2008.codfw.wmnet
* 14:20 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 14:20 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 14:19 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2235.codfw.wmnet
* 14:19 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2235.codfw.wmnet
* 14:14 Msz2001: Finished deploying private code change
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2235.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2008.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 14:08 moritzm: installing unbound security updates
* 14:07 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2234.codfw.wmnet
* 14:07 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2234.codfw.wmnet
* 14:00 tgr_: UTC afternoon deploys done
* 14:00 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304038{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]], [[gerrit:1304039{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]] (duration: 11m 51s)
* 13:56 tgr@deploy1003: tgr: Continuing with deployment
* 13:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2234.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2007.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:52 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy2005.codfw.wmnet
* 13:52 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy2005.codfw.wmnet
* 13:51 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2232.codfw.wmnet
* 13:51 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2232.codfw.wmnet
* 13:51 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 13:51 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 13:50 tgr@deploy1003: tgr: Backport for [[gerrit:1304038{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]], [[gerrit:1304039{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:48 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1304038{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]], [[gerrit:1304039{{!}}Fix CentralAuthPostLoginRedirect type parameter on token loss (T429495)]]
* 13:46 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303613{{!}}magwiki: add wordmark, metanamespace, sitename and timezone (T428279)]], [[gerrit:1304004{{!}}stream: webrequest.page_trending.dev0 (T429588)]] (duration: 08m 15s)
* 13:42 tgr@deploy1003: javiermonton, tgr, anzx: Continuing with deployment
* 13:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 13:40 tgr@deploy1003: javiermonton, tgr, anzx: Backport for [[gerrit:1303613{{!}}magwiki: add wordmark, metanamespace, sitename and timezone (T428279)]], [[gerrit:1304004{{!}}stream: webrequest.page_trending.dev0 (T429588)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:38 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1303613{{!}}magwiki: add wordmark, metanamespace, sitename and timezone (T428279)]], [[gerrit:1304004{{!}}stream: webrequest.page_trending.dev0 (T429588)]]
* 13:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2232.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy2005.codfw.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:33 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 13:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303004{{!}}REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771)]] (duration: 06m 56s)
* 13:26 ladsgroup@deploy1003: ladsgroup, bpirkle: Continuing with deployment
* 13:25 ladsgroup@deploy1003: ladsgroup, bpirkle: Backport for [[gerrit:1303004{{!}}REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303004{{!}}REST: Adjust key of Reading Lists OpenAPI spec in RestSandboxSpecs (T422771)]]
* 13:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:21 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302923{{!}}EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380)]] (duration: 10m 55s)
* 13:14 ladsgroup@deploy1003: ladsgroup, lerickson: Continuing with deployment
* 13:10 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:10 ladsgroup@deploy1003: ladsgroup, lerickson: Backport for [[gerrit:1302923{{!}}EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:08 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to drbd
* 13:08 fabfur: deploying new haproxykafka on A:cp to parse for x_provenance ([[phab:T427068|T427068]])
* 13:08 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1302923{{!}}EventStreamConfig: add stream for WDQS V2 external/internal queries. (T429380)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of testvm2005.codfw.wmnet to plain
* 13:05 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of testvm2005.codfw.wmnet to plain
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
* 13:03 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
* 13:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis magwiki in section s5
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2004.codfw.wmnet
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2003.codfw.wmnet
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2002.codfw.wmnet
* 13:00 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=codfw,name=dse-k8s-wdqs2001.codfw.wmnet
* 12:56 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 12:39 fabfur: upgrade haproxykafka on cp1111 to test for new x-provenance field ([[phab:T427068|T427068]])
* 12:36 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 12:35 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis magwiki in section s5
* 12:34 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 12:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis magwiki in section s5
* 12:31 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1304017{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]], [[gerrit:1304016{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]] (duration: 17m 49s)
* 12:29 cwilliams@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis magwiki in section s5
* 12:27 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:16 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1304017{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]], [[gerrit:1304016{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:14 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1304017{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]], [[gerrit:1304016{{!}}TranslatePage: Cast to string before using htmlspecialchars (T429459)]]
* 11:10 atsukoito: atsuko updated charlie to 0.0.19 https://w.wiki/RPKN
* 10:37 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.disable-merges (exit_code=99)
* 10:37 jmm@cumin2002: START - Cookbook sre.puppet.disable-merges
* 10:24 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303986{{!}}hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394)]] (duration: 12m 13s)
* 10:19 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 10:14 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1303986{{!}}hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1303986{{!}}hCaptcha: Recompute blocked-edit risk score block IDs server-side (T428394)]]
* 10:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 10:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 10:01 fabfur@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]"
* 10:01 fabfur@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]
* 10:00 fabfur@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]
* 10:00 fabfur@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Change provenance var context - fabfur@cumin1003 - [[phab:T427068|T427068]]"
* 09:59 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303983{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]], [[gerrit:1303982{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]] (duration: 08m 10s)
* 09:55 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:54 kharlan@deploy1003: kharlan: Backport for [[gerrit:1303983{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]], [[gerrit:1303982{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:51 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1303983{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]], [[gerrit:1303982{{!}}CaptchaScoreHooks: Log risk score for every non-exempt edit (T429481)]]
* 09:33 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 09:33 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 09:33 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 09:32 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 09:11 moritzm: installing apache2 security updates
* 08:55 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 08:53 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 08:53 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 08:51 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 08:51 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 08:51 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 08:35 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:34 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:22 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 08:21 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 08:20 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 08:19 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 08:05 moritzm: regenerate pbuilder environments on build2001 to use deb.debian.org [[phab:T416707|T416707]]
* 08:02 moritzm: uploaded wmf-laptop 1.0.6 to component/wmf-laptop on apt.wikimedia.org
* 08:01 moritzm: regenerate pbuilder environments on build2002 to use deb.debian.org [[phab:T416707|T416707]]
* 06:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 06:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2040: Migration of es2040.codfw.wmnet completed
* 06:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2040: Migration of es2040.codfw.wmnet completed
* 05:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2040.codfw.wmnet with OS trixie
* 05:41 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
* 05:41 marostegui@cumin1003: Removing db1224 from zarcillo [[phab:T429561|T429561]]
* 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1224.eqiad.wmnet
* 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1224.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:40 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1224.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:36 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 05:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2040.codfw.wmnet with reason: host reimage
* 05:31 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2040.codfw.wmnet with reason: host reimage
* 05:31 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db1224.eqiad.wmnet
* 05:30 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 05:27 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db1224 from dbctl [[phab:T429561|T429561]]', diff saved to https://phabricator.wikimedia.org/P94269 and previous config saved to /var/cache/conftool/dbconfig/20260618-052737-marostegui.json
* 05:14 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2040.codfw.wmnet with OS trixie
* 05:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2040: Upgrading es2040.codfw.wmnet
* 05:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2040: Upgrading es2040.codfw.wmnet
* 05:12 marostegui@cumin1003: dbmaint on es7@codfw [[phab:T429463|T429463]]
* 05:12 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 45s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303600{{!}}Update interwiki map (T428266)]] (duration: 06m 55s)
* 01:15 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 01:14 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1303600{{!}}Update interwiki map (T428266)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:12 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303600{{!}}Update interwiki map (T428266)]]
* 00:48 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303596{{!}}Activate magwiki (T428266)]] (duration: 07m 25s)
* 00:43 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:42 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1303596{{!}}Activate magwiki (T428266)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:40 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303596{{!}}Activate magwiki (T428266)]]
* 00:33 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303594{{!}}Init magwiki (T428266)]] (duration: 07m 14s)
* 00:29 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:28 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1303594{{!}}Init magwiki (T428266)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1303594{{!}}Init magwiki (T428266)]]
== 2026-06-17 ==
* 23:26 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303504{{!}}Enable beta mobile MMV on Wikipedias (T426775)]] (duration: 06m 46s)
* 23:22 egardner@deploy1003: egardner: Continuing with deployment
* 23:21 egardner@deploy1003: egardner: Backport for [[gerrit:1303504{{!}}Enable beta mobile MMV on Wikipedias (T426775)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:19 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1303504{{!}}Enable beta mobile MMV on Wikipedias (T426775)]]
* 23:17 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303552{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303553{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303554{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] (duration: 06m 55s)
* 23:14 mutante: gerrit2002 - unlink /srv/gerrit/site_path/review_site/logs/logs ([[phab:T425667|T425667]])
* 23:12 egardner@deploy1003: egardner: Continuing with deployment
* 23:12 egardner@deploy1003: egardner: Backport for [[gerrit:1303552{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303553{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303554{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:10 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1303552{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303553{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303554{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]]
* 23:04 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303571{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303572{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303573{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] (duration: 12m 31s)
* 22:57 egardner@deploy1003: egardner: Continuing with deployment
* 22:56 egardner@deploy1003: egardner: Backport for [[gerrit:1303571{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303572{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303573{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:52 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1303571{{!}}Image Browsing: fix transparent images in carousel (T429047)]], [[gerrit:1303572{{!}}MMV Beta Viewer: Make in-flight image downloads abortable (T429193)]], [[gerrit:1303573{{!}}MMV Beta Viewer: Delay the loading indicator on quick navigation (T429193)]]
* 22:45 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303517{{!}}Donor Delight Badge: Add accessible label and hide popover from AT (T427313)]] (duration: 31m 01s)
* 22:32 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:31 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1303517{{!}}Donor Delight Badge: Add accessible label and hide popover from AT (T427313)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:14 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1303517{{!}}Donor Delight Badge: Add accessible label and hide popover from AT (T427313)]]
* 21:52 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:52 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:29 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:29 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:29 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:28 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:27 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:27 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:23 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:22 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:22 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:21 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:20 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:20 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:15 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:12 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:12 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:09 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:06 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:05 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:02 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 21:02 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 20:45 cdobbins@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dns7002.wikimedia.org with reason: bird.service keeps failing
* 20:41 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-ats (exit_code=0) rolling restart_daemons on A:cp
* 20:41 cdobbins@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns7002.wikimedia.org with OS trixie
* 20:36 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303012{{!}}Enable ULS v2 on group1 wikis]] (duration: 08m 26s)
* 20:31 sbisson@deploy1003: sbisson, abi: Continuing with deployment
* 20:29 sbisson@deploy1003: sbisson, abi: Backport for [[gerrit:1303012{{!}}Enable ULS v2 on group1 wikis]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:27 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1303012{{!}}Enable ULS v2 on group1 wikis]]
* 20:17 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303365{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]], [[gerrit:1303364{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]] (duration: 06m 55s)
* 20:13 sgimeno@deploy1003: sgimeno: Continuing with deployment
* 20:12 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1303365{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]], [[gerrit:1303364{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:11 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1303365{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]], [[gerrit:1303364{{!}}migrateMentorStatusAway: Return SIMULATED for all dry-run executions (T409170)]]
* 19:44 jgreen@dns1005: END - running authdns-update
* 19:42 jgreen@dns1005: START - running authdns-update
* 19:31 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5005*<nowiki>}</nowiki> and A:liberica ([[phab:T428229|T428229]])
* 19:30 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5005*<nowiki>}</nowiki> and A:liberica ([[phab:T428229|T428229]])
* 19:16 jhuneidi@deploy1003: Finished scap sync-world: wmf.7 to group 1 (Take 2) (duration: 07m 01s)
* 19:16 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-purged (exit_code=0) rolling restart_daemons on A:cp and not P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:10 jhuneidi@deploy1003: Started scap sync-world: wmf.7 to group 1 (Take 2)
* 19:08 jhuneidi@deploy1003: Finished scap sync-world: Attempt to roll wmf.7 to group 1 (duration: 07m 24s)
* 19:01 jhuneidi@deploy1003: Started scap sync-world: Attempt to roll wmf.7 to group 1
* 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcontrol1008-dev.eqiad.wmnet
* 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:00 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol1008-dev.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 18:59 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcontrol1008-dev.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 18:52 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 18:46 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudcontrol1008-dev.eqiad.wmnet
* 18:24 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6011.*
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6011.drmrs.wmnet
* 18:24 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp6011.drmrs.wmnet
* 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp6011.drmrs.wmnet with reason: ats restart, continuing from failed cookbook run
* 18:17 brett: commit new lvs5005 IP address to cr2-eqsin.wikimedia.org,cr3-eqsin.wikimedia.org
* 18:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp6011.drmrs.wmnet
* 18:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp6011.drmrs.wmnet
* 18:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6011.*
* 17:41 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs5005.eqsin.wmnet with OS bookworm
* 17:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs5005.eqsin.wmnet with reason: host reimage
* 17:16 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs5005.eqsin.wmnet with reason: host reimage
* 17:06 mutante: contint1003 - even with gerrit:1301416, jenkins was STILL restarted :/ - stopping it and puppet manually - debugging - [[phab:T418521|T418521]]
* 17:03 mutante: contint1003 - re-enabling puppet - checking it does NOT start jenkins - also see gerrit:1297236 and gerrit:1301416 - [[phab:T418521|T418521]]
* 16:51 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:51 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:49 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-ats rolling restart_daemons on A:cp
* 16:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host lvs5005
* 16:48 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs5005
* 16:48 dcausse@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:47 dcausse@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:47 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs5005
* 16:47 brett@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:47 brett@cumin2002: START - Cookbook sre.dns.wipe-cache lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:45 brett@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:45 brett@cumin2002: START - Cookbook sre.dns.wipe-cache lvs5005.eqsin.wmnet 6.0.132.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 16:45 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:45 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host lvs5005 - brett@cumin2002"
* 16:45 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host lvs5005 - brett@cumin2002"
* 16:45 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:45 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:39 brett@cumin2002: START - Cookbook sre.dns.netbox
* 16:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1078.eqiad.wmnet with OS trixie
* 16:16 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:16 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host lvs5005
* 16:16 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:15 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs5005.eqsin.wmnet with OS bookworm
* 16:15 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1007.eqiad.wmnet with OS trixie
* 16:15 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:11 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 16:02 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 16:02 brett@cumin2002: START - Cookbook sre.loadbalancer.admin depooling P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 16:00 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-purged rolling restart_daemons on A:cp and not P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 15:58 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1078.eqiad.wmnet with reason: host reimage
* 15:54 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1007.eqiad.wmnet with reason: host reimage
* 15:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Migration of es2048.codfw.wmnet completed
* 15:53 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1078.eqiad.wmnet with reason: host reimage
* 15:47 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1007.eqiad.wmnet with reason: host reimage
* 15:46 moritzm: installing python-ldap security updates
* 15:42 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1078.eqiad.wmnet with OS trixie
* 15:30 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:27 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 15:26 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
* 15:08 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Migration of es2048.codfw.wmnet completed
* 15:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:03 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1004.eqiad.wmnet with OS trixie
* 15:02 aokoth@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab (duration: 01m 24s)
* 15:00 aokoth@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab
* 14:59 cdobbins@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 14:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2048.codfw.wmnet with OS trixie
* 14:56 cdobbins@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dns7002.wikimedia.org with reason: host reimage
* 14:44 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1004.eqiad.wmnet with reason: host reimage
* 14:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2048.codfw.wmnet with reason: host reimage
* 14:35 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1004.eqiad.wmnet with reason: host reimage
* 14:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2048.codfw.wmnet with reason: host reimage
* 14:28 cdobbins@cumin1003: START - Cookbook sre.hosts.reimage for host dns7002.wikimedia.org with OS trixie
* 14:26 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303436{{!}}Add Wikidata configuration for WikiProject links (T422935 T422936)]] (duration: 07m 49s)
* 14:22 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
* 14:21 cjd91: depooling dns7002 to attempt reimage to trixie
* 14:20 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1303436{{!}}Add Wikidata configuration for WikiProject links (T422935 T422936)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:19 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1004.eqiad.wmnet with OS trixie
* 14:19 cdobbins@cumin1003: conftool action : set/pooled=no; selector: name=dns7002.*
* 14:18 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1303436{{!}}Add Wikidata configuration for WikiProject links (T422935 T422936)]]
* 14:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2048.codfw.wmnet with OS trixie
* 14:17 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 14:17 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 14:17 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 14:16 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 14:16 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2048: Upgrading es2048.codfw.wmnet
* 14:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2048: Upgrading es2048.codfw.wmnet
* 14:13 elukey: add basic Kafka ACLs for anonymous to logging-eqiad - [[phab:T425528|T425528]]
* 14:13 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:13 Lucas_WMDE: UTC afternoon backport+config window done
* {{safesubst:SAL entry|1=14:13 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302739{{!}}ULS rewrite: Lock body scroll when open on mobile]], [[gerrit:1302743{{!}}ULS rewrite: Fix settings dialog width and field sizing (T416512)]], [[gerrit:1303010{{!}}ULS rewrite: Show variants even when no languages are available (T426532)]], [[gerrit:1303009{{!}}ULS rewrite: Capture trigger element before async module load (T429145)]], [[gerr}}
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs-test1001.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1003.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1002.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1001.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs-test1001.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1003.eqiad.wmnet
* 14:12 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1002.eqiad.wmnet
* 14:11 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs1001.eqiad.wmnet
* 14:11 btullis@puppetserver1001: conftool action : set/weight=10; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-wdqs*.eqiad.wmnet
* 14:08 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, abi: Continuing with deployment
* 14:06 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:01 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 14:00 jmm@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2003.codfw.wmnet with OS bookworm
* 13:58 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2004.codfw.wmnet with OS bookworm
* 13:58 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* {{safesubst:SAL entry|1=13:55 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, abi: Backport for [[gerrit:1302739{{!}}ULS rewrite: Lock body scroll when open on mobile]], [[gerrit:1302743{{!}}ULS rewrite: Fix settings dialog width and field sizing (T416512)]], [[gerrit:1303010{{!}}ULS rewrite: Show variants even when no languages are available (T426532)]], [[gerrit:1303009{{!}}ULS rewrite: Capture trigger element before async module load (T429145)]], [[ge}}
* {{safesubst:SAL entry|1=13:53 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1302739{{!}}ULS rewrite: Lock body scroll when open on mobile]], [[gerrit:1302743{{!}}ULS rewrite: Fix settings dialog width and field sizing (T416512)]], [[gerrit:1303010{{!}}ULS rewrite: Show variants even when no languages are available (T426532)]], [[gerrit:1303009{{!}}ULS rewrite: Capture trigger element before async module load (T429145)]], [[gerri}}
* 13:52 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 13:51 jmm@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 13:51 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.bmc-user-mgmt (exit_code=0) for host sretest[2001,2003-2004,2006,2009-2010].codfw.wmnet,sretest1005.eqiad.wmnet
* 13:50 elukey@cumin1003: START - Cookbook sre.hosts.bmc-user-mgmt for host sretest[2001,2003-2004,2006,2009-2010].codfw.wmnet,sretest1005.eqiad.wmnet
* 13:47 papaul: mgmt interface change on mr1-codfw
* 13:46 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-codfw with reason: mgmt interface change
* 13:45 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-codfw with reason: switch refresh
* 13:42 jmm@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:42 jmm@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 13:33 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298293{{!}}Add Wikidata configuration for WikiProject links (T422935)]], [[gerrit:1299943{{!}}Add instance-of WikiProject links for paintings and elections (T422936)]] (duration: 08m 14s)
* 13:32 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1006.eqiad.wmnet with OS trixie
* 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudcephosd1016
* 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudcephosd1016
* 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1061
* 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1061
* 13:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1069
* 13:31 lucaswerkmeister-wmde@deploy1003: sadiyamohammed13, lucaswerkmeister-wmde: Rolling back deployment
* 13:31 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1069
* 13:30 cmooney@cumin1003: END (PASS) - Cookbook sre.network.cloud-host (exit_code=0) for host cloudvirt1068
* 13:30 cmooney@cumin1003: START - Cookbook sre.network.cloud-host for host cloudvirt1068
* 13:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-gp1005.eqiad.wmnet with OS trixie
* 13:27 lucaswerkmeister-wmde@deploy1003: sadiyamohammed13, lucaswerkmeister-wmde: Backport for [[gerrit:1298293{{!}}Add Wikidata configuration for WikiProject links (T422935)]], [[gerrit:1299943{{!}}Add instance-of WikiProject links for paintings and elections (T422936)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298293{{!}}Add Wikidata configuration for WikiProject links (T422935)]], [[gerrit:1299943{{!}}Add instance-of WikiProject links for paintings and elections (T422936)]]
* 13:24 jmm@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 13:23 jmm@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 13:14 dani@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302998{{!}}Add English Wikipedia Mobile App Survey (T428876)]] (duration: 07m 53s)
* 13:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1006.eqiad.wmnet with reason: host reimage
* 13:11 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-codfw
* 13:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-gp1005.eqiad.wmnet with reason: host reimage
* 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-eqiad
* 13:10 dani@deploy1003: dani: Continuing with deployment
* 13:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1045: repool after upgrade
* 13:08 dani@deploy1003: dani: Backport for [[gerrit:1302998{{!}}Add English Wikipedia Mobile App Survey (T428876)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1006.eqiad.wmnet with reason: host reimage
* 13:06 dani@deploy1003: Started scap sync-world: Backport for [[gerrit:1302998{{!}}Add English Wikipedia Mobile App Survey (T428876)]]
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-gp1005.eqiad.wmnet with reason: host reimage
* 13:00 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:53 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:52 blake@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host mc-gp1006
* 12:52 blake@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host mc-gp1006
* 12:51 blake@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc-gp1006
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mc-gp1006.eqiad.wmnet 182.48.64.10.in-addr.arpa 2.8.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:51 blake@cumin1003: START - Cookbook sre.dns.wipe-cache mc-gp1006.eqiad.wmnet 182.48.64.10.in-addr.arpa 2.8.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host mc-gp1005
* 12:51 blake@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host mc-gp1005
* 12:49 blake@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc-gp1005
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mc-gp1005.eqiad.wmnet 126.32.64.10.in-addr.arpa 6.2.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:49 blake@cumin1003: START - Cookbook sre.dns.wipe-cache mc-gp1005.eqiad.wmnet 126.32.64.10.in-addr.arpa 6.2.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host mc-gp1005 - blake@cumin1003"
* 12:49 blake@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host mc-gp1005 - blake@cumin1003"
* 12:48 blake@cumin1003: START - Cookbook sre.dns.netbox
* 12:45 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-codfw
* 12:45 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-eqiad
* 12:43 blake@cumin1003: START - Cookbook sre.dns.netbox
* 12:41 blake@cumin1003: START - Cookbook sre.hosts.move-vlan for host mc-gp1006
* 12:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 12:41 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:41 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:41 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:41 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1006.eqiad.wmnet with OS trixie
* 12:41 blake@cumin1003: START - Cookbook sre.hosts.move-vlan for host mc-gp1005
* 12:40 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-gp1005.eqiad.wmnet with OS trixie
* 12:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2004.codfw.wmnet with reason: host reimage
* 12:37 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1163: Migration of db1163.eqiad.wmnet completed
* 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2003.codfw.wmnet with reason: host reimage
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:32 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 12:32 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 12:32 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 12:32 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 12:29 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2004.codfw.wmnet with reason: host reimage
* 12:28 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2003.codfw.wmnet with reason: host reimage
* 12:24 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1045: repool after upgrade
* 12:23 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Security updates ([[phab:T426585|T426585]]) - klausman@cumin1003
* 12:22 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1045.eqiad.wmnet with OS trixie
* 12:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2001.codfw.wmnet with reason: host reimage
* 12:19 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:16 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2004.codfw.wmnet with OS bookworm
* 12:16 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2003.codfw.wmnet with OS bookworm
* 12:15 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2001.codfw.wmnet with reason: host reimage
* 12:13 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 12:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:07 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:07 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:07 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 12:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:05 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1045.eqiad.wmnet with reason: host reimage
* 12:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2044: repool after maintenance es2044
* 12:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 12:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 12:01 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1045.eqiad.wmnet with reason: host reimage
* 11:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 11:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 11:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 11:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 11:51 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 11:51 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1163: Migration of db1163.eqiad.wmnet completed
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1045.eqiad.wmnet with OS trixie
* 11:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1045: Upgrading es1045.eqiad.wmnet
* 11:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1045: Upgrading es1045.eqiad.wmnet
* 11:42 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1163.eqiad.wmnet with OS trixie
* 11:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 11:35 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs2002.codfw.wmnet with reason: host reimage
* 11:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1191.eqiad.wmnet with reason: upgrading
* 11:23 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 11:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1163.eqiad.wmnet with reason: host reimage
* 11:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1172.eqiad.wmnet with reason: upgrading
* 11:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host dse-k8s-wdqs2001.codfw.wmnet
* 11:21 marostegui@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1:00:00 on db1171.eqiad.wmnet with reason: upgrading
* 11:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: upgrading
* 11:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1163.eqiad.wmnet with reason: host reimage
* 11:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:17 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2044: repool after maintenance es2044
* 11:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2044.codfw.wmnet with OS trixie
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1003.eqiad.wmnet with OS bookworm
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:11 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:10 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:09 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:08 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1038: Migration of es1038.eqiad.wmnet completed
* 11:04 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1163.eqiad.wmnet with OS trixie
* 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:01 moritzm: The Debian mirror on mirrors.wikimedia.org has been disabled [[phab:T416707|T416707]]
* 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1163: Upgrading db1163.eqiad.wmnet
* 10:59 btullis@cumin1003: START - Cookbook sre.hosts.dhcp for host dse-k8s-wdqs2001.codfw.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1163: Upgrading db1163.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2044.codfw.wmnet with reason: host reimage
* 10:53 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2044.codfw.wmnet with reason: host reimage
* 10:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1003.eqiad.wmnet with reason: host reimage
* 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2203: Migration of db2203.codfw.wmnet completed
* 10:43 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1003.eqiad.wmnet with reason: host reimage
* 10:38 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2044.codfw.wmnet with OS trixie
* 10:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2044: Upgrading es2044.codfw.wmnet
* 10:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2044: Upgrading es2044.codfw.wmnet
* 10:35 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:35 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:35 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:34 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:34 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:34 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:31 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1003.eqiad.wmnet with OS bookworm
* 10:29 moritzm: installing git-lfs security updates
* 10:28 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS bookworm
* 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 10:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1038: Migration of es1038.eqiad.wmnet completed
* 10:22 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 10:21 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 10:17 claime: cumin -x 'A:swift-fe' "enable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1038.eqiad.wmnet with OS trixie
* 10:12 claime: cumin -x 'A:swift-fe' "disable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
* 10:10 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 10:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:04 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1002.eqiad.wmnet with reason: host reimage
* 10:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2203: Migration of db2203.codfw.wmnet completed
* 10:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1002.eqiad.wmnet with reason: host reimage
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1038.eqiad.wmnet with reason: host reimage
* 09:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1038.eqiad.wmnet with reason: host reimage
* 09:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2203.codfw.wmnet with OS trixie
* 09:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2045: repool after maintenance es2045
* 09:48 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 09:47 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303356{{!}}hCaptcha: Remove config for VE and DT enable (T428883)]], [[gerrit:1303354{{!}}Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883)]] (duration: 15m 32s)
* 09:41 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 09:39 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 09:38 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1038.eqiad.wmnet with OS trixie
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1038: Upgrading es1038.eqiad.wmnet
* 09:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1303356{{!}}hCaptcha: Remove config for VE and DT enable (T428883)]], [[gerrit:1303354{{!}}Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1038: Upgrading es1038.eqiad.wmnet
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:37 marostegui@dns1004: END - running authdns-update
* 09:36 marostegui@cumin1003: dbctl commit (dc=all): 'Set es6 eqiad back to read-write - [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94226 and previous config saved to /var/cache/conftool/dbconfig/20260617-093559-marostegui.json
* 09:35 marostegui@dns1004: START - running authdns-update
* 09:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es1038 [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94225 and previous config saved to /var/cache/conftool/dbconfig/20260617-093513-marostegui.json
* 09:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2203.codfw.wmnet with reason: host reimage
* 09:33 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1037 to es6 primary [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94224 and previous config saved to /var/cache/conftool/dbconfig/20260617-093310-marostegui.json
* 09:32 marostegui: Starting es6 eqiad failover from es1038 to es1037 - [[phab:T429436|T429436]]
* 09:32 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1303356{{!}}hCaptcha: Remove config for VE and DT enable (T428883)]], [[gerrit:1303354{{!}}Drop $wgDiscussionToolsHCaptchaRequiredForAllEdits (T428883)]]
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es1037 with weight 0 [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94223 and previous config saved to /var/cache/conftool/dbconfig/20260617-092940-marostegui.json
* 09:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 8 hosts with reason: Primary switchover es6 [[phab:T429436|T429436]]
* 09:29 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS bookworm
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es6 eqiad as read-only for maintenance - [[phab:T429436|T429436]]', diff saved to https://phabricator.wikimedia.org/P94222 and previous config saved to /var/cache/conftool/dbconfig/20260617-092913-marostegui.json
* 09:27 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2203.codfw.wmnet with reason: host reimage
* 09:26 jynus: testing x1 backups @ cumin2003 [[phab:T427897|T427897]]
* 09:11 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2203.codfw.wmnet with OS trixie
* 09:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2203: Upgrading db2203.codfw.wmnet
* 09:09 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2203: Upgrading db2203.codfw.wmnet
* 09:09 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:07 elukey: add basic Kafka ACLs for anonymous to logging-codfw - [[phab:T425528|T425528]] (I'll add rollback steps in the task if needed)
* 09:06 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2045: repool after maintenance es2045
* 09:06 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es2044: Upgrading es2044.codfw.wmnet
* 09:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2044: Upgrading es2044.codfw.wmnet
* 09:04 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:02 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2046 to es5 codfw primary [[phab:T428572|T428572]]', diff saved to https://phabricator.wikimedia.org/P94219 and previous config saved to /var/cache/conftool/dbconfig/20260617-090221-marostegui.json
* 09:02 joal@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:01 joal@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:00 joal@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 08:59 joal@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:57 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2203 [[phab:T429190|T429190]]', diff saved to https://phabricator.wikimedia.org/P94218 and previous config saved to /var/cache/conftool/dbconfig/20260617-085615-cwilliams.json
* 08:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2009.codfw.wmnet with OS trixie
* 08:55 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:55 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:53 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2212 to s1 primary [[phab:T429190|T429190]]', diff saved to https://phabricator.wikimedia.org/P94217 and previous config saved to /var/cache/conftool/dbconfig/20260617-085310-cwilliams.json
* 08:51 cezmunsta: Starting s1 codfw failover from db2203 to db2212 - [[phab:T429190|T429190]]
* 08:51 marostegui@dns1004: END - running authdns-update
* 08:49 marostegui@dns1004: START - running authdns-update
* 08:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:46 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2212 with weight 0 [[phab:T429190|T429190]]', diff saved to https://phabricator.wikimedia.org/P94215 and previous config saved to /var/cache/conftool/dbconfig/20260617-084642-cwilliams.json
* 08:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:46 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T429190|T429190]]
* 08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1044: repool after upgrade
* 08:38 jelto: "Imported helm3 3.19.5-1 to bullseye-wikimedia, bookworm-wikimedia and trixie-wikimedia - [[phab:T427403|T427403]]"
* 08:38 moritzm: installing apache2 security updates
* 08:36 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:35 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2009.codfw.wmnet with reason: host reimage
* 08:31 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2009.codfw.wmnet with reason: host reimage
* 08:25 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1303296{{!}}Squashed diff to master]], [[gerrit:1303295{{!}}Squashed diff to master]] (duration: 35m 34s)
* 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2008.codfw.wmnet with OS trixie
* 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:17 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:14 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2009.codfw.wmnet with OS trixie
* 08:12 mlitn@deploy1003: mlitn: Continuing with deployment
* 08:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 08:09 mlitn@deploy1003: mlitn: Backport for [[gerrit:1303296{{!}}Squashed diff to master]], [[gerrit:1303295{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:07 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 08:04 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2008.codfw.wmnet with reason: host reimage
* 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host conf2007.codfw.wmnet with OS trixie
* 08:04 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:03 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs1001.eqiad.wmnet with OS bookworm
* 08:01 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 08:00 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1044: repool after upgrade
* 08:00 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2008.codfw.wmnet with reason: host reimage
* 07:59 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1044.eqiad.wmnet with OS trixie
* 07:53 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:49 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1303296{{!}}Squashed diff to master]], [[gerrit:1303295{{!}}Squashed diff to master]]
* 07:44 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on conf2007.codfw.wmnet with reason: host reimage
* 07:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 07:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 07:42 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2008.codfw.wmnet with OS trixie
* 07:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1044.eqiad.wmnet with reason: host reimage
* 07:39 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on conf2007.codfw.wmnet with reason: host reimage
* 07:32 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1044.eqiad.wmnet with reason: host reimage
* 07:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 07:23 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:23 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2007.codfw.wmnet with OS trixie
* 07:22 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:22 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003"
* 07:22 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003
* 07:21 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:21 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003
* 07:21 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes (attempt 3) - oblivian@cumin1003"
* 07:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1044.eqiad.wmnet with OS trixie
* 07:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1044: Upgrading es1044.eqiad.wmnet
* 07:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1044: Upgrading es1044.eqiad.wmnet
* 07:15 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1037: Migration of es1037.eqiad.wmnet completed
* 06:53 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 06:53 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 06:52 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 06:52 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 06:46 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 06:46 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 06:46 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 06:46 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 06:28 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1037: Migration of es1037.eqiad.wmnet completed
* 06:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1037.eqiad.wmnet with OS trixie
* 05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1037.eqiad.wmnet with reason: host reimage
* 05:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1037.eqiad.wmnet with reason: host reimage
* 05:38 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1037.eqiad.wmnet with OS trixie
* 05:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1037: Upgrading es1037.eqiad.wmnet
* 05:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1037: Upgrading es1037.eqiad.wmnet
* 05:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 00:01 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
* 00:01 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
== 2026-06-16 ==
* 23:44 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2006.codfw.wmnet with reason: host reimage
* 23:38 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2006.codfw.wmnet with reason: host reimage
* 23:03 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 23:02 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp - OpenSSL update ()
* 23:01 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
* 22:57 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
* 22:57 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
* 22:52 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
* 22:50 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet
* 22:50 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:49 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:37 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host wikikube-ctrl2006.codfw.wmnet
* 22:30 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 22:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:08 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:07 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302953{{!}}Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355)]], [[gerrit:1302952{{!}}Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355)]] (duration: 08m 11s)
* 22:02 kemayo@deploy1003: kemayo: Continuing with deployment
* 22:01 kemayo@deploy1003: kemayo: Backport for [[gerrit:1302953{{!}}Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355)]], [[gerrit:1302952{{!}}Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:59 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1302953{{!}}Update VE core submodule to master (0930c3a9e) (T406841 T429174 T397501 T424632 T429355)]], [[gerrit:1302952{{!}}Update VE core submodule to master (0930c3a9e) (T397501 T424632 T429355)]]
* 21:52 ryankemper@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 21:50 ryankemper@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 21:49 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 21:48 ryankemper@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 21:48 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:46 ryankemper@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:45 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:38 robh@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:34 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302934{{!}}Update definition of html heading to match Parsoid/core (T417530 T417531 T428677)]] (duration: 18m 41s)
* 21:32 robh@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:31 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:30 robh@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:29 cscott@deploy1003: arlolra, cscott: Continuing with deployment
* 21:26 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 21:25 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 21:24 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 21:24 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 21:23 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 21:21 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 21:20 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 21:20 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS bookworm
* 21:17 cscott@deploy1003: arlolra, cscott: Backport for [[gerrit:1302934{{!}}Update definition of html heading to match Parsoid/core (T417530 T417531 T428677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1302934{{!}}Update definition of html heading to match Parsoid/core (T417530 T417531 T428677)]]
* 21:10 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:08 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 20:54 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2043.*
* 20:51 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302890{{!}}Guard round function with a supports query (T424596)]], [[gerrit:1302935{{!}}Add wprov parameter to home link (T429268)]] (duration: 09m 28s)
* 20:47 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:43 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1302890{{!}}Guard round function with a supports query (T424596)]], [[gerrit:1302935{{!}}Add wprov parameter to home link (T429268)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:41 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1302890{{!}}Guard round function with a supports query (T424596)]], [[gerrit:1302935{{!}}Add wprov parameter to home link (T429268)]]
* 20:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns5004.*
* 20:33 brett@dns1004: END - running authdns-update
* 20:31 brett@dns1004: START - running authdns-update
* 20:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns5004.wikimedia.org with OS bookworm
* 20:30 brett@dns5004: FAIL - running authdns-update
* 20:29 brett@dns5004: START - running authdns-update
* 20:28 brett@dns5004: FAIL - running authdns-update
* 20:27 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302320{{!}}EditChecks: Namespace tracking object for seen/shown/used checks]] (duration: 09m 50s)
* 20:26 brett@dns5004: START - running authdns-update
* 20:26 brett@dns5004: START - running authdns-update
* 20:25 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns5004.*,service=authdns-update
* 20:23 kemayo@deploy1003: kemayo: Continuing with deployment
* 20:19 kemayo@deploy1003: kemayo: Backport for [[gerrit:1302320{{!}}EditChecks: Namespace tracking object for seen/shown/used checks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:18 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 20:17 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1302320{{!}}EditChecks: Namespace tracking object for seen/shown/used checks]]
* 20:09 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 20:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs1001.eqiad.wmnet with reason: host reimage
* 19:56 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs1001.eqiad.wmnet with reason: host reimage
* 19:55 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 19:55 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:54 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:47 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:46 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS bookworm
* 19:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1001.eqiad.wmnet with OS bookworm
* 19:39 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:35 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp - OpenSSL update ()
* 19:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:31 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 19:30 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp - OpenSSL update ()
* 19:27 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:18 topranks: restarting grpc server on eqiad SR-Linux switches to recover from problem of no free threads [[phab:T429242|T429242]]
* 19:08 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:08 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 19:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:00 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302274{{!}}Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188)]] (duration: 11m 18s)
* 18:58 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:56 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:56 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:55 krinkle@deploy1003: krinkle: Continuing with deployment
* 18:52 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:51 krinkle@deploy1003: krinkle: Backport for [[gerrit:1302274{{!}}Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:48 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1302274{{!}}Disable ShortUrl on hiwiki, hiwikiversity, maiwiki, knwiki, knwikisource, tcywiki (T107188)]]
* 18:45 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns5004.wikimedia.org with reason: host reimage
* 18:41 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
* 18:41 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/data-gateway: apply
* 18:41 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns5004.wikimedia.org with reason: host reimage
* 18:40 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
* 18:39 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
* 18:39 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 18:39 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 18:35 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 18:34 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 18:33 robh@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 18:30 robh@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 18:23 jhuneidi@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.7 refs [[phab:T423916|T423916]]
* 18:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:12 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host dns5004
* 18:12 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dns5004
* 18:08 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dns5004
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dns5004.wikimedia.org 8.166.102.103.in-addr.arpa 8.0.0.0.6.6.1.0.2.0.1.0.3.0.1.0.1.0.0.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 18:08 brett@cumin2002: START - Cookbook sre.dns.wipe-cache dns5004.wikimedia.org 8.166.102.103.in-addr.arpa 8.0.0.0.6.6.1.0.2.0.1.0.3.0.1.0.1.0.0.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host dns5004 - brett@cumin2002"
* 18:08 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host dns5004 - brett@cumin2002"
* 18:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:00 brett@cumin2002: START - Cookbook sre.dns.netbox
* 18:00 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:59 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:53 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=dns5004.*
* 17:47 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:47 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change mgmt name for frproto1001 - cmooney@cumin1003"
* 17:46 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host dns5004
* 17:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host dns5004.wikimedia.org with OS bookworm
* 17:44 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change mgmt name for frproto1001 - cmooney@cumin1003"
* 17:43 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host conf2007.codfw.wmnet with OS trixie
* 17:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302912{{!}}Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it"]], [[gerrit:1302909{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]], [[gerrit:1302908{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]] (duration: 32m 19s)
* 17:38 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 17:30 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:29 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302912{{!}}Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it"]], [[gerrit:1302909{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]], [[gerrit:1302908{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:27 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host conf2007.codfw.wmnet with OS trixie
* 17:25 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1007.eqiad.wmnet with OS trixie
* 17:20 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
* 17:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302912{{!}}Revert^2 "hCaptcha: Enable for UploadWizard on all wikis with it"]], [[gerrit:1302909{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]], [[gerrit:1302908{{!}}PublishCaptchaHandler: Only require CAPTCHA for UploadWizard (T429322)]]
* 16:35 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:09 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab1004 - [[phab:T429350|T429350]] (duration: 00m 45s)
* 16:08 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab1004 - [[phab:T429350|T429350]]
* 16:08 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy
* 16:08 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: deploy phab2002 - [[phab:T429350|T429350]] (duration: 00m 47s)
* 16:07 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: deploy phab2002 - [[phab:T429350|T429350]]
* 16:06 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy
* 16:04 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 15:42 urbanecm@deploy1003: mwscript-k8s job started: GrowthExperiments:migrateMentorStatusAway --wiki=abwiki --dry-run # [[phab:T409170|T409170]]
* 15:39 moritzm: installing Tomcat security updates
* 15:38 urbanecm: Remove `migrateMentorStatusAwayToCommunityConfiguration` from `updatelog` on all wikis in `growthexperiments.dblist` ([[phab:T409170|T409170]])
* 15:38 dancy@deploy1003: Installation of scap version "4.269.0" completed for 2 hosts
* 15:36 dancy@deploy1003: Installing scap version "4.269.0" for 2 host(s)
* 15:33 brennen@deploy1003: Finished deploy [phabricator/deployment@a640ed9]: test deploy phab2003 - [[phab:T427286|T427286]] (duration: 00m 49s)
* 15:33 brennen@deploy1003: Started deploy [phabricator/deployment@a640ed9]: test deploy phab2003 - [[phab:T427286|T427286]]
* 15:16 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 15:16 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 15:07 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-welcome # [[phab:T429352|T429352]]
* 15:06 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-discovery # [[phab:T429352|T429352]]
* 15:03 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-homepage-mentorship # [[phab:T429352|T429352]]
* 15:01 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302804{{!}}Hotfix for T428620 (T428620)]] (duration: 10m 00s)
* 14:57 awight@deploy1003: seanleong-wmde, awight: Continuing with deployment
* 14:55 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments purgeUserOptions.php --login-age 1 growthexperiments-tour-help-panel # [[phab:T429352|T429352]]
* 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for frproto1001 (formerly payments1008) - cmooney@cumin1003"
* 14:54 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for frproto1001 (formerly payments1008) - cmooney@cumin1003"
* 14:53 awight@deploy1003: seanleong-wmde, awight: Backport for [[gerrit:1302804{{!}}Hotfix for T428620 (T428620)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:51 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1302804{{!}}Hotfix for T428620 (T428620)]]
* 14:48 aokoth@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab (duration: 02m 09s)
* 14:46 aokoth@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab
* 14:28 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 14:28 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 14:07 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302792{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187)]], [[gerrit:1302793{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T429187)]] (duration: 11m 29s)
* 14:07 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:03 dcausse@deploy1003: jgiannelos, dcausse: Continuing with deployment
* 14:02 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 14:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:58 dcausse@deploy1003: jgiannelos, dcausse: Backport for [[gerrit:1302792{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187)]], [[gerrit:1302793{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T429187)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:56 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1302792{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T417530 T428105 T429187)]], [[gerrit:1302793{{!}}Bump wikimedia/parsoid to 0.24.0-a10 (T429187)]]
* 13:54 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:52 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:52 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:52 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:51 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:48 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302850{{!}}Revert "translate: remove CirrusSearch endpoints"]] (duration: 04m 10s)
* 13:47 atsuko@deploy1003: atsuko: Rolling back deployment
* 13:47 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:46 atsuko@deploy1003: atsuko: Backport for [[gerrit:1302850{{!}}Revert "translate: remove CirrusSearch endpoints"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:44 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1302850{{!}}Revert "translate: remove CirrusSearch endpoints"]]
* 13:44 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:43 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:43 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 13:43 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:41 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:40 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 13:40 cmooney@cumin2002: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 13:39 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:39 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302197{{!}}translate: remove CirrusSearch endpoints (T425377)]] (duration: 11m 16s)
* 13:37 atsuko@deploy1003: atsuko: Rolling back deployment
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1080.eqiad.wmnet with OS trixie
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:36 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:34 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 13:32 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1079.eqiad.wmnet with OS trixie
* 13:32 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:30 atsuko@deploy1003: atsuko: Backport for [[gerrit:1302197{{!}}translate: remove CirrusSearch endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1302197{{!}}translate: remove CirrusSearch endpoints (T425377)]]
* 13:25 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299626{{!}}Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206)]] (duration: 08m 50s)
* 13:25 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:21 dcausse@deploy1003: dcausse, neriah: Continuing with deployment
* 13:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 13:20 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1080.eqiad.wmnet with reason: host reimage
* 13:18 dcausse@deploy1003: dcausse, neriah: Backport for [[gerrit:1299626{{!}}Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:16 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1299626{{!}}Replace wgNewUserMessageOnAutoCreate with wgNewUserMessageOnFirstEdit (T426206)]]
* 13:15 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:12 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1080.eqiad.wmnet with reason: host reimage
* 13:12 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298875{{!}}Remove custom streams (T423148)]] (duration: 08m 35s)
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1079.eqiad.wmnet with reason: host reimage
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1008.eqiad.wmnet with OS trixie
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:07 jmm@dns1004: END - running authdns-update
* 13:06 mfossati@deploy1003: ksarabia, mfossati: Continuing with deployment
* 13:05 mfossati@deploy1003: ksarabia, mfossati: Backport for [[gerrit:1298875{{!}}Remove custom streams (T423148)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 jmm@dns1004: START - running authdns-update
* 13:03 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1298875{{!}}Remove custom streams (T423148)]]
* 13:02 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1079.eqiad.wmnet with reason: host reimage
* 13:02 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:02 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 13:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1080.eqiad.wmnet with OS trixie
* 12:57 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:52 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1079.eqiad.wmnet with OS trixie
* 12:52 cmooney@cumin2002: START - Cookbook sre.mysql.pool pool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 12:51 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1007.eqiad.wmnet with OS trixie
* 12:50 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 12:50 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver2002.codfw.wmnet
* 12:48 cmooney@cumin1003: START - Cookbook sre.mysql.pool pool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2255.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2255.codfw.wmnet
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2254.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2254.codfw.wmnet
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2243.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2243.codfw.wmnet
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2242.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2242.codfw.wmnet
* 12:47 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 12:47 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2092.codfw.wmnet
* 12:47 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2092.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2091.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2091.codfw.wmnet
* 12:46 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 29 hosts
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2078.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2078.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2077.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2077.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2076.codfw.wmnet
* 12:46 cmooney@cumin1003: START - Cookbook sre.hosts.remove-downtime for 29 hosts
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2076.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2075.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2075.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2074.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2074.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2051.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2051.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2044.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2044.codfw.wmnet
* 12:46 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2041.codfw.wmnet
* 12:46 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2041.codfw.wmnet
* 12:46 cmooney@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2001.codfw.wmnet
* 12:46 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:45 cmooney@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2001.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2018.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2018.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2017.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2017.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2014.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2014.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2013.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2013.codfw.wmnet
* 12:45 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2012.codfw.wmnet
* 12:45 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2012.codfw.wmnet
* 12:44 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1008.eqiad.wmnet with reason: host reimage
* 12:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver2002.codfw.wmnet
* 12:40 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1008.eqiad.wmnet with reason: host reimage
* 12:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1006.eqiad.wmnet with reason: host reimage
* 12:28 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:24 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1008.eqiad.wmnet with OS trixie
* 12:24 topranks: reboot lsw1-a5-codfw to complete JunOS upgrade [[phab:T428020|T428020]]
* 12:23 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1007.eqiad.wmnet with OS trixie
* 12:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1006.eqiad.wmnet with reason: host reimage
* 12:19 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2255.codfw.wmnet
* 12:19 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2255.codfw.wmnet
* 12:19 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2254.codfw.wmnet
* 12:18 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2254.codfw.wmnet
* 12:17 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2243.codfw.wmnet
* 12:17 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2243.codfw.wmnet
* 12:17 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2242.codfw.wmnet
* 12:16 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2242.codfw.wmnet
* 12:16 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2092.codfw.wmnet
* 12:16 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2092.codfw.wmnet
* 12:16 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2091.codfw.wmnet
* 12:15 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2091.codfw.wmnet
* 12:15 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2078.codfw.wmnet
* 12:14 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2078.codfw.wmnet
* 12:14 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2077.codfw.wmnet
* 12:14 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2077.codfw.wmnet
* 12:14 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2076.codfw.wmnet
* 12:13 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2076.codfw.wmnet
* 12:13 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2075.codfw.wmnet
* 12:12 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2075.codfw.wmnet
* 12:12 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2074.codfw.wmnet
* 12:12 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2074.codfw.wmnet
* 12:12 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2051.codfw.wmnet
* 12:10 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 29 hosts with reason: lsw1-a5-codfw JunOS upgrade
* 12:07 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2051.codfw.wmnet
* 12:06 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lsw1-a5-codfw,lsw1-a5-codfw IPv6,lsw1-a5-codfw.mgmt,ssw1-a[1,8]-codfw.mgmt with reason: switch upgrade
* 12:06 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2044.codfw.wmnet
* 12:06 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2044.codfw.wmnet
* 12:06 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2041.codfw.wmnet
* 12:05 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2041.codfw.wmnet
* 12:05 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2018.codfw.wmnet
* 12:05 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2018.codfw.wmnet
* 12:04 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2017.codfw.wmnet
* 12:04 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2017.codfw.wmnet
* 12:04 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2014.codfw.wmnet
* 12:03 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2014.codfw.wmnet
* 12:03 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2013.codfw.wmnet
* 12:03 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2013.codfw.wmnet
* 12:02 cmooney@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2012.codfw.wmnet
* 12:02 cmooney@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2001.codfw.wmnet
* 12:01 cmooney@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2012.codfw.wmnet
* 12:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 11:57 cmooney@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2001.codfw.wmnet
* 11:51 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302794{{!}}Revert "hCaptcha: Enable for UploadWizard on all wikis with it"]] (duration: 08m 45s)
* 11:49 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:49 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2176: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:49 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2175: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 11:48 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2157: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:48 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:47 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2154: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:47 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 11:46 cmooney@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:46 cmooney@cumin1003: START - Cookbook sre.mysql.depool depool db2153: codfw rack a5 depool for switch maintenance [[phab:T428020|T428020]]
* 11:46 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 11:46 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:45 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302794{{!}}Revert "hCaptcha: Enable for UploadWizard on all wikis with it"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:43 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 11:43 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302794{{!}}Revert "hCaptcha: Enable for UploadWizard on all wikis with it"]]
* 11:42 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 11:41 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2035: Migration of es2035.codfw.wmnet completed
* 11:06 moritzm: installing Bird security updates on routed Ganeti nodes
* 10:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es1037 [[phab:T429118|T429118]]', diff saved to https://phabricator.wikimedia.org/P94172 and previous config saved to /var/cache/conftool/dbconfig/20260616-104931-marostegui.json
* 10:25 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 10:24 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 10:24 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2035: Migration of es2035.codfw.wmnet completed
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for an-redacteddb1001.eqiad.wmnet
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for an-redacteddb1001.eqiad.wmnet
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 11 hosts
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for 11 hosts
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1155.eqiad.wmnet
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1155.eqiad.wmnet
* 10:24 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1154.eqiad.wmnet
* 10:24 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1154.eqiad.wmnet
* 10:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Migration of es1036.eqiad.wmnet completed
* 10:22 jmm@dns1004: END - running authdns-update
* 10:22 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:20 jmm@dns1004: START - running authdns-update
* 10:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:19 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 10:18 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 10:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:18 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 10:18 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 10:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2035.codfw.wmnet with OS trixie
* 09:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2035.codfw.wmnet with reason: host reimage
* 09:52 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2035.codfw.wmnet with reason: host reimage
* 09:49 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 09:48 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 09:47 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302762{{!}}hCaptcha: Enable for UploadWizard on all wikis with it (T426126)]] (duration: 09m 38s)
* 09:43 marostegui: Drop wrongly created tables on testwikidatawiki s3 master [[phab:T429304|T429304]]
* 09:42 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 09:39 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302762{{!}}hCaptcha: Enable for UploadWizard on all wikis with it (T426126)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:38 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --wiki=wikidatawiki --registeredWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # [[phab:T418115|T418115]]
* 09:37 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302762{{!}}hCaptcha: Enable for UploadWizard on all wikis with it (T426126)]]
* 09:37 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --wiki=wikidatawiki --registeredWithin=1year --editedWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # [[phab:T418115|T418115]]
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Migration of es1036.eqiad.wmnet completed
* 09:37 urbanecm@deploy1003: mwscript-k8s job started: extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --registeredWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=6hour --verbose --use-job-queue # [[phab:T418115|T418115]]
* 09:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2035.codfw.wmnet with OS trixie
* 09:34 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2035: Upgrading es2035.codfw.wmnet
* 09:34 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2035: Upgrading es2035.codfw.wmnet
* 09:34 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:32 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2035 [[phab:T429303|T429303]]', diff saved to https://phabricator.wikimedia.org/P94164 and previous config saved to /var/cache/conftool/dbconfig/20260616-093247-marostegui.json
* 09:31 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2037 to es6 primary [[phab:T429303|T429303]]', diff saved to https://phabricator.wikimedia.org/P94163 and previous config saved to /var/cache/conftool/dbconfig/20260616-093149-marostegui.json
* 09:31 jayme: imported istioctl 1.29.4-1 to bookworm-/trixie-wikimedia - [[phab:T427401|T427401]]
* 09:30 marostegui: Starting es6 codfw failover from es2035 to es2037 - [[phab:T429303|T429303]]
* 09:30 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 09:30 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 09:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Set es2037 with weight 0 [[phab:T429303|T429303]]', diff saved to https://phabricator.wikimedia.org/P94162 and previous config saved to /var/cache/conftool/dbconfig/20260616-092937-marostegui.json
* 09:29 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 8 hosts with reason: Primary switchover es6 [[phab:T429303|T429303]]
* 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1036.eqiad.wmnet with OS trixie
* 09:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:19 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297161{{!}}[Growth] wikidatawiki: Enable Growth features (T418115)]] (duration: 16m 29s)
* 09:18 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:14 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 09:13 urbanecm: php multiversion/MWScript.php WikimediaMaintenance:createExtensionTables.php --wiki=<nowiki>{</nowiki>testwikidatawiki,wikidatawiki<nowiki>}</nowiki> growthexperiments # [[phab:T418115|T418115]], within mw-debug
* 09:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage
* 09:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 09:07 tappof@cumin1003: END (PASS) - Cookbook sre.metamonitoring.downtime (exit_code=0) Downtime for 0:05:00 of prometheus/deadmanswitchnotified, prometheus/deadmanswitchonamdb, prometheus/extmon on 2 host(s) with reason: cookbook test
* 09:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 09:07 tappof@cumin1003: START - Cookbook sre.metamonitoring.downtime Downtime for 0:05:00 of prometheus/deadmanswitchnotified, prometheus/deadmanswitchonamdb, prometheus/extmon on 2 host(s) with reason: cookbook test
* 09:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 09:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 09:04 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1297161{{!}}[Growth] wikidatawiki: Enable Growth features (T418115)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage
* 09:02 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1297161{{!}}[Growth] wikidatawiki: Enable Growth features (T418115)]]
* 09:01 moritzm: uploaded bird 2.18.2-1~wmf13u1 to trixie-wikimedia [[phab:T429285|T429285]]
* 09:00 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist wikidata WikimediaMaintenance:createExtensionTables.php GrowthExperiments # [[phab:T418115|T418115]]
* 08:56 moritzm: uploaded bird 2.18.2-1~wmf12u1 to bookworm-wikimedia [[phab:T429285|T429285]]
* 08:48 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1036.eqiad.wmnet with OS trixie
* 08:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1036: Upgrading es1036.eqiad.wmnet
* 08:46 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302735{{!}}hCaptcha: Enable for MobileFrontend in all wikis (T425940)]] (duration: 19m 23s)
* 08:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1036: Upgrading es1036.eqiad.wmnet
* 08:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1047: repool after upgrade
* 08:42 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:32 moritzm: installing nginx security updates
* 08:29 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1302735{{!}}hCaptcha: Enable for MobileFrontend in all wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:27 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302735{{!}}hCaptcha: Enable for MobileFrontend in all wikis (T425940)]]
* 08:23 mszwarc@deploy1003: Synchronized private/PrivateSettings.php: Private code deployment for Suggested Investigations (duration: 02m 23s)
* 08:19 mszwarc@deploy1003: Synchronized private/SuggestedInvestigationsSignals: Private code deployment for Suggested Investigations (duration: 06m 03s)
* 08:17 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94157)
* 08:05 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302629{{!}}Improve click intent event logging and exposure tracking]] (duration: 11m 31s)
* 08:00 moritzm: update bird on ganeti7001 to 2.18.2-1~wmf12u1
* 07:58 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
* 07:58 wmde-fisch@deploy1003: wmde-fisch: Backport for [[gerrit:1302629{{!}}Improve click intent event logging and exposure tracking]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1047: repool after upgrade
* 07:54 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1302629{{!}}Improve click intent event logging and exposure tracking]]
* 07:50 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302170{{!}}Update VE core submodule to master (3e79e9934) (T397319 T428764)]] (duration: 36m 13s)
* 07:36 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
* 07:33 wmde-fisch@deploy1003: wmde-fisch: Backport for [[gerrit:1302170{{!}}Update VE core submodule to master (3e79e9934) (T397319 T428764)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:14 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1302170{{!}}Update VE core submodule to master (3e79e9934) (T397319 T428764)]]
* 07:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1047.eqiad.wmnet with OS trixie
* 06:50 hashar@deploy1003: Finished deploy [integration/docroot@2165507]: build: Updating js-yaml to 4.2.0 (duration: 00m 16s)
* 06:50 hashar@deploy1003: Started deploy [integration/docroot@2165507]: build: Updating js-yaml to 4.2.0
* 06:44 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1047.eqiad.wmnet with reason: host reimage
* 06:40 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1047.eqiad.wmnet with reason: host reimage
* 06:25 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1047.eqiad.wmnet with OS trixie
* 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:24 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1047: Upgrading es1047.eqiad.wmnet
* 05:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1047: Upgrading es1047.eqiad.wmnet
* 05:58 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 04:55 ryankemper: [[phab:T427951|T427951]] Deleted 4 leftover mirrored dev/test topics from kafka-test: `eqiad.mediawiki.<nowiki>{</nowiki>page_html_content_change.dev<nowiki>{</nowiki>1,4<nowiki>}</nowiki>,page_edit_type_simple.dev0<nowiki>}</nowiki>`, `eqiad.mw_page_edit_type_enrich.error`
* 04:05 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.4 (duration: 05m 29s)
== 2026-06-15 ==
* 22:35 sbassett: Deployed private config for [[phab:T429244|T429244]]
* 22:05 sbassett: Deployed updated security fix for [[phab:T427611|T427611]]
* 22:04 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 22:04 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 22:04 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 22:03 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 21:54 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302277{{!}}beta: Point remaining db11 references at deployment-db15 (T428930)]] (duration: 12m 27s)
* 21:53 dancy@deploy1003: dancy: Continuing with deployment
* 21:49 dancy@deploy1003: dancy: Backport for [[gerrit:1302277{{!}}beta: Point remaining db11 references at deployment-db15 (T428930)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:48 sbassett: Deployed security fix for [[phab:T428809|T428809]]
* 21:48 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1302277{{!}}beta: Point remaining db11 references at deployment-db15 (T428930)]]
* 21:40 sbassett: Deployed security fix for [[phab:T428820|T428820]]
* 21:22 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302267{{!}}ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks]] (duration: 08m 11s)
* 21:17 sbassett@deploy1003: sbassett: Continuing with deployment
* 21:15 sbassett@deploy1003: sbassett: Backport for [[gerrit:1302267{{!}}ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:13 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1302267{{!}}ForceReauth: Avoid unnecessary securitySensitiveOperationStatus checks]]
* 21:06 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5028.*
* 21:06 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 21:05 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 20:52 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5028.eqsin.wmnet with OS trixie
* 20:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 20:21 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300245{{!}}REST: set new RestModuleOverrides variable (T422756)]], [[gerrit:1302232{{!}}Enable "exit the editor" survey on 11 wikis for phase 2 (T426132)]] (duration: 10m 54s)
* 20:17 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 20:16 dancy@deploy1003: caro, dancy, bpirkle: Continuing with deployment
* 20:14 dancy@deploy1003: caro, dancy, bpirkle: Backport for [[gerrit:1300245{{!}}REST: set new RestModuleOverrides variable (T422756)]], [[gerrit:1302232{{!}}Enable "exit the editor" survey on 11 wikis for phase 2 (T426132)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1300245{{!}}REST: set new RestModuleOverrides variable (T422756)]], [[gerrit:1302232{{!}}Enable "exit the editor" survey on 11 wikis for phase 2 (T426132)]]
* 20:02 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2001.codfw.wmnet with OS trixie
* 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2001.codfw.wmnet with OS trixie
* 19:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5028
* 19:44 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5028
* 19:43 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5028
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5028.eqsin.wmnet 25.0.132.10.in-addr.arpa 5.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:43 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5028.eqsin.wmnet 25.0.132.10.in-addr.arpa 5.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5028 - brett@cumin2002"
* 19:42 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5028 - brett@cumin2002"
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:36 brett@cumin2002: START - Cookbook sre.dns.netbox
* 19:35 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3067.esams.wmnet
* 19:34 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3067.esams.wmnet
* 19:33 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 19:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3066.esams.wmnet
* 19:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:33 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3066.esams.wmnet
* 19:26 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5028
* 19:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 19:23 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart A:liberica-eqsin
* 19:21 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart A:liberica-eqsin
* 19:18 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 19:17 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:16 brett@cumin2002: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:15 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:14 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:06 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
* 19:05 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 19:05 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:04 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 19:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5026.eqsin.wmnet with OS trixie
* 18:44 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-purged (exit_code=0) rolling restart_daemons on P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:42 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-purged rolling restart_daemons on P<nowiki>{</nowiki>cp7001.magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 18:27 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:27 brett@cumin2002: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:27 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 18:18 mutante: releases2003 - systemctl stop tmp.mount
* 17:53 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5026
* 17:53 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5026
* 17:52 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5026
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5026.eqsin.wmnet 37.0.132.10.in-addr.arpa 7.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 17:52 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5026.eqsin.wmnet 37.0.132.10.in-addr.arpa 7.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5026 - brett@cumin2002"
* 17:52 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5026 - brett@cumin2002"
* 17:46 brett@cumin2002: START - Cookbook sre.dns.netbox
* 17:40 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-d8-eqiad
* 17:40 cmooney@cumin1003: START - Cookbook sre.network.tls for network device ssw1-d8-eqiad
* 17:36 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-c4-eqiad
* 17:35 cmooney@cumin1003: START - Cookbook sre.network.tls for network device lsw1-c4-eqiad
* 17:34 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-c4-eqiad
* 17:34 cmooney@cumin1003: START - Cookbook sre.network.tls for network device lsw1-c4-eqiad
* 17:09 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5026
* 17:07 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5026.eqsin.wmnet with OS trixie
* 17:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:36 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 16:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 16:16 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
* 16:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:16 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/toolhub: apply
* {{safesubst:SAL entry|1=16:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302192{{!}}SourceEditorOverlayHookPayload: Allow aborting of the save (T428287)]], [[gerrit:1302194{{!}}hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287)]], [[gerrit:1302195{{!}}OATHUserRepository: Specify caller in query]], [[gerrit:1302186{{!}}Bump guzzlehttp/psr to version 2.11.0 (T429208)]], [[gerrit:1302169{{!}}NoReferrerLinks: Add re}}
* 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:08 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
* 16:08 dreamyjazz@deploy1003: reedy, dreamyjazz, kharlan: Continuing with deployment
* 16:08 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/toolhub: apply
* {{safesubst:SAL entry|1=16:07 dreamyjazz@deploy1003: reedy, dreamyjazz, kharlan: Backport for [[gerrit:1302192{{!}}SourceEditorOverlayHookPayload: Allow aborting of the save (T428287)]], [[gerrit:1302194{{!}}hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287)]], [[gerrit:1302195{{!}}OATHUserRepository: Specify caller in query]], [[gerrit:1302186{{!}}Bump guzzlehttp/psr to version 2.11.0 (T429208)]], [[gerrit:1302169{{!}}NoReferrerLinks: Add}}
* {{safesubst:SAL entry|1=16:05 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1302192{{!}}SourceEditorOverlayHookPayload: Allow aborting of the save (T428287)]], [[gerrit:1302194{{!}}hCaptcha MobileFrontend: Avoid indefinite save loop on known errors (T428287)]], [[gerrit:1302195{{!}}OATHUserRepository: Specify caller in query]], [[gerrit:1302186{{!}}Bump guzzlehttp/psr to version 2.11.0 (T429208)]], [[gerrit:1302169{{!}}NoReferrerLinks: Add rel}}
* 16:04 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:04 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:51 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:51 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: puppet debugging
* 15:50 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: puppet debugging
* 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1196: Migration of db1196.eqiad.wmnet completed
* 15:41 mutante: added new project language 'nyn' - Bantu language spoken by the Nkore and Hema peoples of Southwestern Uganda
* 15:40 dzahn@dns1006: END - running authdns-update
* 15:36 dzahn@dns1006: START - running authdns-update
* 15:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 15:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1155.eqiad.wmnet
* 15:19 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1155.eqiad.wmnet
* 15:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1154.eqiad.wmnet
* 15:18 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1154.eqiad.wmnet
* 15:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 11 hosts
* 15:18 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for 11 hosts
* 15:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for an-redacteddb1001.eqiad.wmnet
* 15:17 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for an-redacteddb1001.eqiad.wmnet
* 15:16 topranks: repool esams following cr2-esams rpd crash
* 15:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool esams [reason: no reason specified, no task ID specified]
* 15:13 cmooney@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool esams [reason: no reason specified, no task ID specified]
* 15:02 topranks: depool esams due to cr2-esams rpd crash
* 15:02 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool esams [reason: no reason specified, no task ID specified]
* 15:01 cmooney@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool esams [reason: no reason specified, no task ID specified]
* 15:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 14:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 14:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1196: Migration of db1196.eqiad.wmnet completed
* 14:54 topranks: enable BGP graceful-shutdown sender on cr2-esams to drain traffic [[phab:T427056|T427056]]
* 14:52 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr2-esams,cr2-esams IPv6 with reason: bouncing pic0 to reconfigure port speeds
* 14:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1196.eqiad.wmnet with OS trixie
* 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1077.eqiad.wmnet with OS trixie
* 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 14:24 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2001.codfw.wmnet with reason: tesT
* 14:24 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 14:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
* 14:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
* 14:08 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 14:07 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 14:07 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudvirt1077.eqiad.wmnet with reason: host reimage
* 14:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1077.eqiad.wmnet with reason: host reimage
* 14:06 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 14:05 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 14:05 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 14:04 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 14:03 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1196.eqiad.wmnet with OS trixie
* 14:02 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 14:02 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 14:01 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: revert deployment - oblivian@cumin1003
* 14:01 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "revert deployment - oblivian@cumin1003"
* 14:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1196: Upgrading db1196.eqiad.wmnet
* 14:00 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1196: Upgrading db1196.eqiad.wmnet
* 14:00 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:56 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cloudvirt1077.eqiad.wmnet with OS trixie
* 13:56 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 13:54 federico3: doing a quick restart of sanitarium hosts db1155 and db1154
* 13:53 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94145)
* 13:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1154.eqiad.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1155.eqiad.wmnet with reason: Reboots [[phab:T426633|T426633]]
* 13:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 11 hosts with reason: Reboots [[phab:T426633|T426633]]
* 13:49 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet with reason: Reboots [[phab:T426633|T426633]]
* {{safesubst:SAL entry|1=13:43 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300835{{!}}Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742)]], [[gerrit:1302153{{!}}TaskSuggester: avoid nullable logger in setLogger call]], [[gerrit:1302100{{!}}migrateMentorStatusAway: ensure validateStrictly receives objects (T409170)]], [[gerrit:1301451{{!}}Store nowiki source in StripState::extra to support subst-nowiki (T}}
* 13:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 13:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 13:39 jforrester@deploy1003: arlolra, sgimeno, jforrester: Continuing with deployment
* {{safesubst:SAL entry|1=13:37 jforrester@deploy1003: arlolra, sgimeno, jforrester: Backport for [[gerrit:1300835{{!}}Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742)]], [[gerrit:1302153{{!}}TaskSuggester: avoid nullable logger in setLogger call]], [[gerrit:1302100{{!}}migrateMentorStatusAway: ensure validateStrictly receives objects (T409170)]], [[gerrit:1301451{{!}}Store nowiki source in StripState::extra to support subst-nowik}}
* {{safesubst:SAL entry|1=13:35 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1300835{{!}}Remove no longer used product_metrics.homepage_module_interaction (T365889 T426742)]], [[gerrit:1302153{{!}}TaskSuggester: avoid nullable logger in setLogger call]], [[gerrit:1302100{{!}}migrateMentorStatusAway: ensure validateStrictly receives objects (T409170)]], [[gerrit:1301451{{!}}Store nowiki source in StripState::extra to support subst-nowiki (T3}}
* 13:34 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2216: Migration of db2216.codfw.wmnet completed
* 13:29 topranks: enable BGP graceful-shutdown sender on cr2-esams to drain traffic [[phab:T427056|T427056]]
* 13:28 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr2-esams,cr2-esams IPv6 with reason: bouncing pic0 to reconfigure port speeds
* 13:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:26 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 13:25 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 13:25 topranks: cr2-esams, reconfigure chassis fpc to set port 0 to 100G [[phab:T427056|T427056]]
* 13:25 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Haproxy provenance maps in HP; UX changes - oblivian@cumin1003
* 13:24 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Haproxy provenance maps in HP; UX changes - oblivian@cumin1003"
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1251: Migration of db1251.eqiad.wmnet completed
* {{safesubst:SAL entry|1=13:22 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293173{{!}}Configure wgOAuthAutoApprove['protocols'] (T412542 T426614)]], [[gerrit:1300873{{!}}jawiki: remove four rights from the eliminator group (T428942)]], [[gerrit:1301401{{!}}Deploy PRV to 6 wikis (T429038)]], [[gerrit:1300858{{!}}[abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730)]], [[gerrit:1300872{{!}}abstractwiki: Temporary config f}}
* 13:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:18 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:17 jforrester@deploy1003: arlolra, matmarex, jforrester, dragoniez: Continuing with deployment
* {{safesubst:SAL entry|1=13:13 jforrester@deploy1003: arlolra, matmarex, jforrester, dragoniez: Backport for [[gerrit:1293173{{!}}Configure wgOAuthAutoApprove['protocols'] (T412542 T426614)]], [[gerrit:1300873{{!}}jawiki: remove four rights from the eliminator group (T428942)]], [[gerrit:1301401{{!}}Deploy PRV to 6 wikis (T429038)]], [[gerrit:1300858{{!}}[abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730)]], [[gerrit:1300872{{!}}abstractwiki: Te}}
* 13:13 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:12 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* {{safesubst:SAL entry|1=13:12 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1293173{{!}}Configure wgOAuthAutoApprove['protocols'] (T412542 T426614)]], [[gerrit:1300873{{!}}jawiki: remove four rights from the eliminator group (T428942)]], [[gerrit:1301401{{!}}Deploy PRV to 6 wikis (T429038)]], [[gerrit:1300858{{!}}[abstractwiki] Set wgForceUIMsgAsContentMsg for sidebar messages (T427730)]], [[gerrit:1300872{{!}}abstractwiki: Temporary config fo}}
* 13:10 moritzm: installing Linux 6.1.174 on Bookworm hosts
* 13:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:05 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:48 moritzm: installing augeas security updates
* 12:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2216: Migration of db2216.codfw.wmnet completed
* 12:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:43 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2036: Migration of es2036.codfw.wmnet completed
* 12:38 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1302124{{!}}Extract a service that initiates SI signal matching (T428557)]], [[gerrit:1302125{{!}}Trigger Suggested Investigations when client hints are saved (T428557)]] (duration: 07m 42s)
* 12:37 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1251: Migration of db1251.eqiad.wmnet completed
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2216.codfw.wmnet with OS trixie
* 12:34 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:34 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 12:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:32 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1302124{{!}}Extract a service that initiates SI signal matching (T428557)]], [[gerrit:1302125{{!}}Trigger Suggested Investigations when client hints are saved (T428557)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:31 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1302124{{!}}Extract a service that initiates SI signal matching (T428557)]], [[gerrit:1302125{{!}}Trigger Suggested Investigations when client hints are saved (T428557)]]
* 12:27 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1251.eqiad.wmnet with OS trixie
* 12:23 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 12:21 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 12:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2216.codfw.wmnet with reason: host reimage
* 12:15 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2216.codfw.wmnet with reason: host reimage
* 12:10 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1251.eqiad.wmnet with reason: host reimage
* 12:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 12:06 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:05 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1251.eqiad.wmnet with reason: host reimage
* 11:56 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 11:55 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 11:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2036: Migration of es2036.codfw.wmnet completed
* 11:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:53 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2216.codfw.wmnet with OS trixie
* 11:50 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2216: Upgrading db2216.codfw.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2216: Upgrading db2216.codfw.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:48 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1251.eqiad.wmnet with OS trixie
* 11:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1251: Upgrading db1251.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1251: Upgrading db1251.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:44 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver codfw-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on codfw-k8s (dblist: https://phabricator.wikimedia.org/P94128)
* 11:43 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:43 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on eqiad-k8s (dblist: https://phabricator.wikimedia.org/P94127)
* 11:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2036.codfw.wmnet with OS trixie
* 11:37 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2036.codfw.wmnet with reason: host reimage
* 11:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2036.codfw.wmnet with reason: host reimage
* 11:09 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-eqiad
* 11:08 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-eqiad
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2036.codfw.wmnet with OS trixie
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2036: Upgrading es2036.codfw.wmnet
* 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2036: Upgrading es2036.codfw.wmnet
* 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:55 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling restart_daemons on A:schema-codfw
* 10:54 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling restart_daemons on A:schema-codfw
* 10:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2037: repool after upgrade
* 10:52 moritzm: installing openssl security updates on bookworm
* 10:30 cgoubert@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301341{{!}}Close API Portal wiki (T427537)]] (duration: 07m 16s)
* 10:26 cgoubert@deploy1003: cgoubert: Continuing with deployment
* 10:25 cgoubert@deploy1003: cgoubert: Backport for [[gerrit:1301341{{!}}Close API Portal wiki (T427537)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:23 cgoubert@deploy1003: Started scap sync-world: Backport for [[gerrit:1301341{{!}}Close API Portal wiki (T427537)]]
* 10:16 blake@deploy1003: Finished scap sync-world: apache config change ([[phab:T428772|T428772]]) (duration: 06m 41s)
* 10:12 blake@deploy1003: blake: Continuing with deployment
* 10:11 blake@deploy1003: blake: apache config change ([[phab:T428772|T428772]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:10 blake@deploy1003: Started scap sync-world: apache config change ([[phab:T428772|T428772]])
* 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2037: repool after upgrade
* 10:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2037.codfw.wmnet with OS trixie
* 09:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 09:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 09:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 09:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:43 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 09:42 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 09:40 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-k8s # [[phab:T425377|T425377]]: populating translation memory (ttmserver-export.php) on eqiad-k8s (dblist: https://phabricator.wikimedia.org/P94120)
* 09:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2037.codfw.wmnet with reason: host reimage
* 09:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2037.codfw.wmnet with reason: host reimage
* 09:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:15 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:14 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2037.codfw.wmnet with OS trixie
* 09:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:12 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:59 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:56 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2037.codfw.wmnet with OS trixie
* 08:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2037.codfw.wmnet with OS trixie
* 08:53 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2037: Upgrading es2037.codfw.wmnet
* 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2037: Upgrading es2037.codfw.wmnet
* 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:45 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:45 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 08:44 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:43 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:41 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:40 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:35 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main'.
* 08:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P94117 and previous config saved to /var/cache/conftool/dbconfig/20260615-081440-fceratto.json
* 08:10 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301373{{!}}translate: production opensearch on k8s endpoints (T425377)]] (duration: 20m 54s)
* 08:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2047: Migration of es2047.codfw.wmnet completed
* 08:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P94115 and previous config saved to /var/cache/conftool/dbconfig/20260615-080432-fceratto.json
* 08:03 atsuko@deploy1003: atsuko: Continuing with deployment
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P94114 and previous config saved to /var/cache/conftool/dbconfig/20260615-075425-fceratto.json
* 07:53 atsuko@deploy1003: atsuko: Backport for [[gerrit:1301373{{!}}translate: production opensearch on k8s endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:49 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1301373{{!}}translate: production opensearch on k8s endpoints (T425377)]]
* 07:47 dcausse@deploy1003: mwscript-k8s job started: namespaceDupes cswiki --fix # [[phab:T428619|T428619]]
* 07:46 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301675{{!}}Switch wmgUseCalendar to false for dewikivoyage (T429095)]], [[gerrit:1300301{{!}}Add alias namespace for cswiki (T428619)]] (duration: 34m 37s)
* 07:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1163 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P94112 and previous config saved to /var/cache/conftool/dbconfig/20260615-074417-fceratto.json
* 07:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:33 dcausse@deploy1003: vadymts1, dcausse: Continuing with deployment
* 07:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:31 cwilliams@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on db-test2001.codfw.wmnet with reason: Testing
* 07:28 dcausse@deploy1003: vadymts1, dcausse: Backport for [[gerrit:1301675{{!}}Switch wmgUseCalendar to false for dewikivoyage (T429095)]], [[gerrit:1300301{{!}}Add alias namespace for cswiki (T428619)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:26 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:26 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:25 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:24 arnaudb@dns1005: END - running authdns-update
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1163 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P94110 and previous config saved to /var/cache/conftool/dbconfig/20260615-072446-fceratto.json
* 07:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 07:24 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:23 elukey@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:23 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2047: Migration of es2047.codfw.wmnet completed
* 07:23 arnaudb@dns1005: START - running authdns-update
* 07:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:21 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:11 elukey@cumin2002: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2047.codfw.wmnet with OS trixie
* 07:11 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1301675{{!}}Switch wmgUseCalendar to false for dewikivoyage (T429095)]], [[gerrit:1300301{{!}}Add alias namespace for cswiki (T428619)]]
* 07:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 06:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2047.codfw.wmnet with reason: host reimage
* 06:53 moritzm: imported zookeeper 3.4.13-6+wmf12u1 to component/zookeeper34 for bookworm-wikimedia [[phab:T428495|T428495]]
* 06:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2047.codfw.wmnet with reason: host reimage
* 06:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2047.codfw.wmnet with OS trixie
* 06:28 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2047: Upgrading es2047.codfw.wmnet
* 06:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2047: Upgrading es2047.codfw.wmnet
* 06:27 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 06:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:59 marostegui: install mariadb 10.11.18 on pc1 [[phab:T428861|T428861]]
* 05:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2021.codfw.wmnet,pc1021.eqiad.wmnet with reason: upgrading
* 05:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:56 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:56 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:49 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:49 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.18 [[phab:T428861|T428861]]
* 05:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2046', diff saved to https://phabricator.wikimedia.org/P94105 and previous config saved to /var/cache/conftool/dbconfig/20260615-053403-marostegui.json
* 05:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on es2046.codfw.wmnet with reason: cloning
* 05:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on es2045.codfw.wmnet with reason: crash
* 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2046', diff saved to https://phabricator.wikimedia.org/P94104 and previous config saved to /var/cache/conftool/dbconfig/20260615-053041-marostegui.json
* 02:18 Amir1: making Dexbot a bot in cywiki ([[phab:T428927|T428927]])
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 58s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-14 ==
* 11:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 11:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 11:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 11:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 34s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-13 ==
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-12 ==
* 19:54 dwisehaupt@dns1004: END - running authdns-update
* 19:52 dwisehaupt@dns1004: START - running authdns-update
* 18:33 dwisehaupt@dns1006: END - running authdns-update
* 18:32 dwisehaupt@dns1006: START - running authdns-update
* 16:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:10 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 15:59 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 15:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 15:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:43 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1301371{{!}}Hotfix for T428620 (T428620)]] (duration: 11m 17s)
* 14:36 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 14:35 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1301371{{!}}Hotfix for T428620 (T428620)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:31 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1301371{{!}}Hotfix for T428620 (T428620)]]
* 14:29 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:28 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:22 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:22 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:22 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:22 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 12:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 12:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 12:04 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:04 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:04 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 12:02 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 12:01 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to drbd
* 11:40 moritzm: installing Linux 5.10.257 on Bullseye hosts
* 11:36 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:35 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:35 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:24 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 11:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:56 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:56 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:49 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:49 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:40 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:37 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:36 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:35 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 10:35 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 10:12 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 10:12 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/toolhub: apply
* 10:08 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 09:59 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 09:58 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 09:57 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.disable-merges (exit_code=0)
* 06:11 jmm@cumin2002: START - Cookbook sre.puppet.disable-merges
* 03:07 ryankemper: [[phab:T427951|T427951]] sorry, `[eqiad,codfw].mediawiki.page_html_content_change.rc0` (accidentally a word)
* 03:06 ryankemper: [[phab:T427951|T427951]] Deleted all 20 unused dev/test topics on kafka-jumbo (verified empty first); 2 (`[eqiad,codfw]page_html_content_change.rc0`) were immediately auto-recreated empty by a still-running `dse-k8s` enrichment consumer; awaiting owner confirmation before final re-delete
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 01m 13s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:00 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
== 2026-06-11 ==
* 22:27 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 22:26 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 22:14 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 22:13 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 22:05 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300906{{!}}Restore MediaViewer toggle in Special:Preferences (T428742)]] (duration: 30m 51s)
* 21:58 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host releases2003.codfw.wmnet with OS trixie
* 21:52 egardner@deploy1003: egardner: Continuing with deployment
* 21:51 egardner@deploy1003: egardner: Backport for [[gerrit:1300906{{!}}Restore MediaViewer toggle in Special:Preferences (T428742)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:34 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1300906{{!}}Restore MediaViewer toggle in Special:Preferences (T428742)]]
* 21:34 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: host reimage
* 21:29 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300913{{!}}Avoid the escaping from nowiki processing (T398967)]] (duration: 09m 09s)
* 21:28 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on releases2003.codfw.wmnet with reason: host reimage
* 21:25 arlolra@deploy1003: arlolra: Continuing with deployment
* 21:22 arlolra@deploy1003: arlolra: Backport for [[gerrit:1300913{{!}}Avoid the escaping from nowiki processing (T398967)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:20 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1300913{{!}}Avoid the escaping from nowiki processing (T398967)]]
* 21:07 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300911{{!}}hCaptcha: Enable for badlogin for all small wikis (T426875)]], [[gerrit:1300905{{!}}RadioRangeBallot: Fix strict mode issue (T428947)]] (duration: 10m 43s)
* 21:06 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on A:cp-text and not P<nowiki>{</nowiki>cp7008*<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 21:01 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 21:00 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300911{{!}}hCaptcha: Enable for badlogin for all small wikis (T426875)]], [[gerrit:1300905{{!}}RadioRangeBallot: Fix strict mode issue (T428947)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:56 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300911{{!}}hCaptcha: Enable for badlogin for all small wikis (T426875)]], [[gerrit:1300905{{!}}RadioRangeBallot: Fix strict mode issue (T428947)]]
* 20:51 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300842{{!}}Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313)]], [[gerrit:1300843{{!}}[A11y] Donor Badge: Remove Badge button disappears too quickly (T428646)]], [[gerrit:1300896{{!}}Donor Delight Badge, styles: Amending to final design review feedback (T427313)]] (duration: 34m 10s)
* 20:39 jdrewniak@deploy1003: annet, jdrewniak: Continuing with deployment
* 20:35 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host releases2003.codfw.wmnet with OS trixie
* 20:34 jdrewniak@deploy1003: annet, jdrewniak: Backport for [[gerrit:1300842{{!}}Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313)]], [[gerrit:1300843{{!}}[A11y] Donor Badge: Remove Badge button disappears too quickly (T428646)]], [[gerrit:1300896{{!}}Donor Delight Badge, styles: Amending to final design review feedback (T427313)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug
* 20:17 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1300842{{!}}Donor Delight Badge: Unify on "Remove badge" language across treatments (T427313)]], [[gerrit:1300843{{!}}[A11y] Donor Badge: Remove Badge button disappears too quickly (T428646)]], [[gerrit:1300896{{!}}Donor Delight Badge, styles: Amending to final design review feedback (T427313)]]
* 19:12 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 18:12 ozge@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 18:12 ozge@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 17:52 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300865{{!}}UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146)]] (duration: 08m 15s)
* 17:48 reedy@deploy1003: reedy: Continuing with deployment
* 17:46 reedy@deploy1003: reedy: Backport for [[gerrit:1300865{{!}}UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:44 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1300865{{!}}UploadWizard.config.php: Fix cc-by-4.0-heirs msg issue (T428935 T405146)]]
* 17:26 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:25 blake@deploy1003: Scap cancelled without rolling back.
* 17:25 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 17:24 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 17:24 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:24 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 17:24 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:23 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 17:23 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 17:23 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:23 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 17:23 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:23 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 17:20 blake@deploy1003: blake: apache config update ([[phab:T428772|T428772]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:20 blake@deploy1003: Started scap sync-world: apache config update ([[phab:T428772|T428772]])
* 17:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 17:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Migration of db2212.codfw.wmnet completed
* 17:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 17:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1235: Migration of db1235.eqiad.wmnet completed
* 17:08 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:45 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:43 dzahn@dns1005: END - running authdns-update
* 16:42 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:41 dzahn@dns1005: START - running authdns-update
* 16:41 mutante: releases.wikimedia.org - switching backend from codfw to eqiad - releases1003 is now the source of rsync for uploaded releases files (use releases.discovery.wmnet to not have to think about it) - [[phab:T418299|T418299]]
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts rdb2007.codfw.wmnet
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts rdb1011.eqiad.wmnet
* 16:35 jiji@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2009.codfw.wmnet
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:33 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Migration of db2212.codfw.wmnet completed
* 16:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2009.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1235: Migration of db1235.eqiad.wmnet completed
* 16:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2212.codfw.wmnet with OS trixie
* 16:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1235.eqiad.wmnet with OS trixie
* 16:13 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:07 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 16:05 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 16:05 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 16:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 16:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2212.codfw.wmnet with reason: host reimage
* 16:01 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
* 16:01 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:01 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
* 16:01 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 16:00 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
* 16:00 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 16:00 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
* 16:00 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2212.codfw.wmnet with reason: host reimage
* 15:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1235.eqiad.wmnet with reason: host reimage
* 15:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:58 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 15:57 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
* 15:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:57 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 15:57 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/wikifeeds: apply
* 15:56 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2009.codfw.wmnet
* 15:55 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:55 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb1011.eqiad.wmnet
* 15:55 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:55 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2007.codfw.wmnet
* 15:54 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 15:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1235.eqiad.wmnet with reason: host reimage
* 15:54 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 15:53 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 15:53 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 15:40 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 15:40 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2212.codfw.wmnet with OS trixie
* 15:39 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 15:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1235.eqiad.wmnet with OS trixie
* 15:36 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 15:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1235: Upgrading db1235.eqiad.wmnet
* 15:35 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 15:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1235: Upgrading db1235.eqiad.wmnet
* 15:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:32 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:31 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 15:30 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300822{{!}}T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530)]] (duration: 11m 29s)
* 15:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2212: Upgrading db2212.codfw.wmnet
* 15:26 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2212: Upgrading db2212.codfw.wmnet
* 15:26 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:26 cscott@deploy1003: cscott: Continuing with deployment
* 15:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1235: Upgrading db1235.eqiad.wmnet
* 15:25 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1235: Upgrading db1235.eqiad.wmnet
* 15:25 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:21 cscott@deploy1003: cscott: Backport for [[gerrit:1300822{{!}}T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:19 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1300822{{!}}T428849: temporarily disable noisy warnings in HandleParsoidSectionLinks (T428849 T417530)]]
* 15:18 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 15:17 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 15:13 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 15:13 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 15:13 moritzm: installing libdbi-perl security updates
* 14:53 moritzm: installing Bind security updates (just client-side tools/libraries)
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry
* 14:48 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry
* 14:43 moritzm: installing Poppler security updates
* 14:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:33 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 14:32 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 14:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1234: Migration of db1234.eqiad.wmnet completed
* 14:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
* 14:24 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
* 14:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
* 14:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
* 14:00 Lucas_WMDE: UTC afternoon backport+config window done
* 13:58 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300733{{!}}stream: webrequest.page_view_stats.dev0 (T428725)]] (duration: 08m 12s)
* 13:57 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5024.*
* 13:55 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp5024.*
* 13:55 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5020.*
* 13:54 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 13:52 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1300733{{!}}stream: webrequest.page_view_stats.dev0 (T428725)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:51 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs5004*<nowiki>}</nowiki> and A:liberica
* 13:50 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1300733{{!}}stream: webrequest.page_view_stats.dev0 (T428725)]]
* 13:50 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs5004*<nowiki>}</nowiki> and A:liberica
* 13:50 slyngs: reloading liberica config on lvs5004
* 13:50 moritzm: installing openssl security updates
* 13:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5006.eqsin.wmnet with OS bookworm
* 13:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1234: Migration of db1234.eqiad.wmnet completed
* 13:46 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:45 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2202.codfw.wmnet with OS trixie
* 13:43 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298890{{!}}Add 2FA enforcement demotion config for phase 3 groups (T423120)]] (duration: 07m 19s)
* 13:39 alexsanford@deploy1003: alexsanford: Continuing with deployment
* 13:38 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1298890{{!}}Add 2FA enforcement demotion config for phase 3 groups (T423120)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:36 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1298890{{!}}Add 2FA enforcement demotion config for phase 3 groups (T423120)]]
* 13:36 slyngshede@dns1004: END - running authdns-update
* 13:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1234.eqiad.wmnet with OS trixie
* 13:34 moritzm: installing dovecot security updates
* 13:34 slyngshede@dns1004: START - running authdns-update
* 13:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:32 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300787{{!}}hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940)]] (duration: 06m 59s)
* 13:29 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 13:29 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 13:29 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:29 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:28 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:28 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:28 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 13:27 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300787{{!}}hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:26 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2202.codfw.wmnet with reason: host reimage
* 13:25 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300787{{!}}hCaptcha: Enable for MobileFrontend on all group1 wikis (T425940)]]
* 13:25 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:24 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300736{{!}}fix: correct intake-url and payload type for NCS experiment events (T422295)]] (duration: 06m 51s)
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
* 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
* 13:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
* 13:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2202.codfw.wmnet with reason: host reimage
* 13:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1300736{{!}}fix: correct intake-url and payload type for NCS experiment events (T422295)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 13:17 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 13:16 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1300736{{!}}fix: correct intake-url and payload type for NCS experiment events (T422295)]]
* 13:15 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
* 13:14 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/Android_FAQ 'Wikimedia Apps/FAQ/Android' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:13 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300731{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] (duration: 08m 47s)
* 13:13 andrewbogott: sudo -i reprepro --noskipold --component thirdparty/openstack-trixie-flamingo-backports update trixie-wikimedia
* 13:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1234.eqiad.wmnet with reason: host reimage
* 13:12 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 13:12 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki '--reason=per [[:phab:T428900]]' Wikimedia_Apps/iOS_FAQ 'Wikimedia Apps/FAQ/iOS' 'Martin Urbanec (WMF)' # [[phab:T428900|T428900]]
* 13:12 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 13:12 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 13:11 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 13:11 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:11 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:11 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 13:11 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 13:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 13:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 13:09 gkyziridis@deploy1003: gkyziridis: Continuing with deployment
* 13:06 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1300731{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:06 claime: echo 'https://api.wikimedia.org/service/lw/specs/openapi.yaml' {{!}} mwscript-k8s --attach -- purgeList.php
* 13:04 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1300731{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]]
* 13:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2202.codfw.wmnet with OS trixie
* 13:00 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:57 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1234.eqiad.wmnet with OS trixie
* 12:55 moritzm: installing Exim security updates on Bullseye
* 12:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ganeti5006
* 12:47 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti5006
* 12:46 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti5006
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti5006.eqsin.wmnet 9.0.132.10.in-addr.arpa 9.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 12:46 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti5006.eqsin.wmnet 9.0.132.10.in-addr.arpa 9.0.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5006 - jmm@cumin2002"
* 12:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5006 - jmm@cumin2002"
* 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1234: Upgrading db1234.eqiad.wmnet
* 12:44 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1234: Upgrading db1234.eqiad.wmnet
* 12:44 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2188: Migration of db2188.codfw.wmnet completed
* 12:29 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "UX improvements - oblivian@cumin1003"
* 12:29 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: UX improvements - oblivian@cumin1003
* 12:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1232: Migration of db1232.eqiad.wmnet completed
* 12:28 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: UX improvements - oblivian@cumin1003
* 12:28 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "UX improvements - oblivian@cumin1003"
* 12:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:26 jmm@cumin2002: START - Cookbook sre.hosts.move-vlan for host ganeti5006
* 12:26 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5006.eqsin.wmnet with OS bookworm
* 12:21 moritzm: remove ganeti5006 from eqsin cluster for reimage [[phab:T428229|T428229]]
* 12:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 12:10 moritzm: installing openjdk-21 security updates on Bookworm
* 12:03 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300764{{!}}Remove GrowthExperiments extension from closed wikis (T428884)]] (duration: 06m 53s)
* 11:59 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 11:58 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1300764{{!}}Remove GrowthExperiments extension from closed wikis (T428884)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:56 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1300764{{!}}Remove GrowthExperiments extension from closed wikis (T428884)]]
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb1012.eqiad.wmnet
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2010.codfw.wmnet
* 11:49 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 11:46 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:46 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb2008.codfw.wmnet
* 11:46 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2188: Migration of db2188.codfw.wmnet completed
* 11:44 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 11:43 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:43 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: rdb2010.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1232: Migration of db1232.eqiad.wmnet completed
* 11:38 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:37 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 11:37 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 11:36 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 11:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2188.codfw.wmnet with OS trixie
* 11:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb1012.eqiad.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2008.codfw.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts rdb2010.codfw.wmnet
* 11:33 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 11:32 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 11:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1232.eqiad.wmnet with OS trixie
* 11:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2002.codfw.wmnet
* 11:25 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300749{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300751{{!}}hCaptcha: Enable for DiscussionTools on all wikis (T426039)]] (duration: 08m 38s)
* 11:21 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 11:19 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300749{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300751{{!}}hCaptcha: Enable for DiscussionTools on all wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2188.codfw.wmnet with reason: host reimage
* 11:17 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300749{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300751{{!}}hCaptcha: Enable for DiscussionTools on all wikis (T426039)]]
* 11:15 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2188.codfw.wmnet with reason: host reimage
* 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1232.eqiad.wmnet with reason: host reimage
* 11:13 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2002.codfw.wmnet
* 11:13 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 11:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 11:11 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 11:09 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2001.codfw.wmnet
* 11:09 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1232.eqiad.wmnet with reason: host reimage
* 11:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:04 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2001.codfw.wmnet
* 11:04 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
* 11:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1262.eqiad.wmnet with reason: crash
* 11:00 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 11:00 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
* 10:59 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:59 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 10:58 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2188.codfw.wmnet with OS trixie
* 10:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2188: Upgrading db2188.codfw.wmnet
* 10:52 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2188: Upgrading db2188.codfw.wmnet
* 10:52 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1232.eqiad.wmnet with OS trixie
* 10:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1232: Upgrading db1232.eqiad.wmnet
* 10:48 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1232: Upgrading db1232.eqiad.wmnet
* 10:48 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:33 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:32 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:31 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300734{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300727{{!}}hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039)]] (duration: 11m 01s)
* 10:26 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 10:23 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:23 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:22 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1300734{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300727{{!}}hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:20 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1300734{{!}}HCaptcha: Return 'forceshowcaptcha' error when CAPTCHA forced (T426476)]], [[gerrit:1300727{{!}}hCaptcha: Enable for DiscussionTools on group 1 wikis (T426039)]]
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:10 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:10 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 10:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2045.codfw.wmnet with OS trixie
* 10:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2046', diff saved to https://phabricator.wikimedia.org/P94069 and previous config saved to /var/cache/conftool/dbconfig/20260611-100221-marostegui.json
* 10:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2046', diff saved to https://phabricator.wikimedia.org/P94068 and previous config saved to /var/cache/conftool/dbconfig/20260611-100145-marostegui.json
* 10:01 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:59 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300580{{!}}ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976)]] (duration: 15m 41s)
* 09:54 jiji@deploy1003: jiji: Continuing with deployment
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2045.codfw.wmnet with reason: host reimage
* 09:45 jiji@deploy1003: jiji: Backport for [[gerrit:1300580{{!}}ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:43 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1300580{{!}}ProductionServices.php: switch filebackend.php back to rdb1013 (T291916 T419976)]]
* 09:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2045.codfw.wmnet with reason: host reimage
* 09:37 elukey: uploaded spicerack_12.8.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 09:26 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
* 09:26 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es2045.codfw.wmnet with OS bookworm
* 09:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2176: Migration of db2176.codfw.wmnet completed
* 09:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1219: Migration of db1219.eqiad.wmnet completed
* 09:11 claime: cumin -x 'A:swift-fe' "disable-puppet 'Disabling puppet for ratelimit deploy - cgoubert'"
* 08:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS bookworm
* 08:39 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2176: Migration of db2176.codfw.wmnet completed
* 08:34 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94055)
* 08:34 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1219: Migration of db1219.eqiad.wmnet completed
* 08:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94053)
* 08:30 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T428823|T428823]] (duration: 01m 18s)
* 08:29 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T428823|T428823]]
* 08:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2176.codfw.wmnet with OS trixie
* 08:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1021: Migration to 10.11.17
* 08:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 08:25 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 08:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc1021: Migration to 10.11.17
* 08:25 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94052)
* 08:24 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): Testing upgrade for [[phab:T428823|T428823]] (duration: 01m 17s)
* 08:23 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): Testing upgrade for [[phab:T428823|T428823]]
* 08:22 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94051)
* 08:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1219.eqiad.wmnet with OS trixie
* 08:17 moritzm: installing PHP 8.2 security updates
* 08:15 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 08:14 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 08:11 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 08:11 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 08:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
* 08:08 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1013.eqiad.wmnet with OS trixie
* 08:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:06 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 08:06 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 08:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2021.codfw.wmnet,pc1021.eqiad.wmnet with reason: upgrade
* 08:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1219.eqiad.wmnet with reason: host reimage
* 08:05 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 08:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 08:04 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2176.codfw.wmnet with reason: host reimage
* 08:04 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 08:03 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 08:03 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1021: Migration to 10.11.17 [[phab:T427345|T427345]]
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 07:58 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1219.eqiad.wmnet with reason: host reimage
* 07:56 marostegui: install MariaDB 10.11.17 on pc1 [[phab:T427345|T427345]]
* 07:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1013.eqiad.wmnet with reason: host reimage
* 07:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1013.eqiad.wmnet with reason: host reimage
* 07:49 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 07:49 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 07:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 07:47 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 07:47 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 07:46 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2176.codfw.wmnet with OS trixie
* 07:43 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1219.eqiad.wmnet with OS trixie
* 07:43 moritzm: imported Jenkins 2.541.3 for thirdparty/ci (Bullseye) and thirdparty/jenkins (Bookworm, Trixie)
* 07:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 07:35 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1013.eqiad.wmnet with OS trixie
* 07:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2176: Upgrading db2176.codfw.wmnet
* 07:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1219: Upgrading db1219.eqiad.wmnet
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2176: Upgrading db2176.codfw.wmnet
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1219: Upgrading db1219.eqiad.wmnet
* 07:31 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 07:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:30 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 07:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1163: Repooling
* 07:19 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 06:51 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
* 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repool es2042', diff saved to https://phabricator.wikimedia.org/P94044 and previous config saved to /var/cache/conftool/dbconfig/20260611-065049-marostegui.json
* 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Depool es2042', diff saved to https://phabricator.wikimedia.org/P94043 and previous config saved to /var/cache/conftool/dbconfig/20260611-065027-marostegui.json
* 06:44 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1163: Repooling
* 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1163 [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94041 and previous config saved to /var/cache/conftool/dbconfig/20260611-064319-fceratto.json
* 06:42 fceratto@dns1005: END - running authdns-update
* 06:40 fceratto@dns1005: START - running authdns-update
* 06:33 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:33 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-write for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:33 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1184 to s1 primary and set section read-write [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94040 and previous config saved to /var/cache/conftool/dbconfig/20260611-063323-fceratto.json
* 06:32 fceratto@cumin1003: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94039 and previous config saved to /var/cache/conftool/dbconfig/20260611-063251-fceratto.json
* 06:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:32 fceratto@cumin1003: Dbctl change: Setting sections s1 as read-write for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:32 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-write for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:31 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94037 and previous config saved to /var/cache/conftool/dbconfig/20260611-063100-fceratto.json
* 06:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:30 fceratto@cumin1003: MariaDB change: Setting sections s1 as read-only for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:30 fceratto@cumin1003: Dbctl change: Setting sections s1 as read-only for [[phab:T426083|T426083]]: 'Maintenance until 06:15 UTC'
* 06:29 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:29 federico3: Starting s1 eqiad failover from db1163 to db1184 - [[phab:T426083|T426083]]
* 06:22 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1184 with weight 0 [[phab:T426083|T426083]]', diff saved to https://phabricator.wikimedia.org/P94035 and previous config saved to /var/cache/conftool/dbconfig/20260611-062224-fceratto.json
* 06:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T426083|T426083]]
* 05:37 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 05:28 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade gitlab
* 05:27 arnaudb@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 05:18 arnaudb@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade gitlab
* 05:17 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2045.codfw.wmnet with OS trixie
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2045: Upgrading es2045.codfw.wmnet
* 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2045: Upgrading es2045.codfw.wmnet
* 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 44s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:23 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2046.*
* 01:19 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 01:18 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 01:18 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1009.eqiad.wmnet with OS trixie
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:12 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:11 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:11 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:11 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:10 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:10 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:09 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 01:09 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 01:08 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 01:08 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 01:08 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 01:07 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:07 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 01:06 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 01:06 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 01:06 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 01:05 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 01:05 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 01:05 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 01:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
* 00:58 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1009.eqiad.wmnet with reason: host reimage
* 00:54 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:53 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 00:52 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 00:51 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main1009
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main1009
* 00:41 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main1009
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main1009.eqiad.wmnet 37.48.64.10.in-addr.arpa 7.3.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:41 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main1009.eqiad.wmnet 37.48.64.10.in-addr.arpa 7.3.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:41 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1009 - jasmine@cumin2002"
* 00:40 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1009 - jasmine@cumin2002"
* 00:39 cdanis@cumin1003: dbctl commit (dc=all): 'depool db1262', diff saved to https://phabricator.wikimedia.org/P94032 and previous config saved to /var/cache/conftool/dbconfig/20260611-003950-cdanis.json
* 00:36 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 00:34 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5020.*
* 00:30 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main1009
* 00:30 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1009.eqiad.wmnet with OS trixie
* 00:03 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5024.*
== 2026-06-10 ==
* 23:53 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5024.*
* 23:15 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300154{{!}}Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188)]] (duration: 11m 37s)
* 23:11 krinkle@deploy1003: krinkle: Continuing with deployment
* 23:06 krinkle@deploy1003: krinkle: Backport for [[gerrit:1300154{{!}}Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:04 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1300154{{!}}Disable ShortUrl on bdwikimedia, bhwiki, bnwiki, bnwikisource, eswikibooks, gomwiki (T107188)]]
* 22:57 ladsgroup@dns1004: END - running authdns-update
* 22:55 ladsgroup@dns1004: START - running authdns-update
* 22:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5024.eqsin.wmnet with OS trixie
* 22:13 mutante: gerrit - restarting service for logging change
* 22:11 dzahn@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:10:00 on gerrit.wikimedia.org with reason: service restart
* 22:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on gerrit2003.wikimedia.org with reason: service restart
* 22:06 mutante: gerrit-spare: restarting gerrit
* 22:06 mutante: gerrit-replica: restarting gerrit
* 21:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 21:37 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 21:22 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300250{{!}}ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801)]], [[gerrit:1300248{{!}}tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade]] (duration: 08m 23s)
* 21:17 jforrester@deploy1003: jforrester: Continuing with deployment
* 21:15 jforrester@deploy1003: jforrester: Backport for [[gerrit:1300250{{!}}ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801)]], [[gerrit:1300248{{!}}tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:13 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1300250{{!}}ExecuteTestAndCacheJob: Fix stdClasses serialised wrongly by JobQueue (T428801)]], [[gerrit:1300248{{!}}tests: Fix StandaloneHooksTest ordering, now broken by DB upgrade]]
* 21:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5024
* 21:02 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5024
* 21:02 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300247{{!}}Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902)]] (duration: 06m 51s)
* 21:00 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5024
* 21:00 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5024.eqsin.wmnet 35.0.132.10.in-addr.arpa 5.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 21:00 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5024.eqsin.wmnet 35.0.132.10.in-addr.arpa 5.3.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 21:00 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:00 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5024 - brett@cumin2002"
* 20:59 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5024 - brett@cumin2002"
* 20:57 catrope@deploy1003: catrope: Continuing with deployment
* 20:57 catrope@deploy1003: catrope: Backport for [[gerrit:1300247{{!}}Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:55 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1300247{{!}}Revert "wgRestSandboxSpecs: Add Lift Wing API to documentation wikis" (T427902)]]
* 20:54 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:50 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5024
* 20:49 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
* 20:48 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5020.*
* 20:44 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300073{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] (duration: 11m 55s)
* 20:40 catrope@deploy1003: catrope, gkyziridis: Continuing with deployment
* 20:34 catrope@deploy1003: catrope, gkyziridis: Backport for [[gerrit:1300073{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:32 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1300073{{!}}wgRestSandboxSpecs: Add Lift Wing API to documentation wikis (T427902)]]
* 20:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5020.eqsin.wmnet with OS trixie
* 20:30 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300226{{!}}[arzwiki] Change the wordmark (T427720)]] (duration: 09m 49s)
* 20:25 catrope@deploy1003: gergesshamon, catrope: Continuing with deployment
* 20:22 catrope@deploy1003: gergesshamon, catrope: Backport for [[gerrit:1300226{{!}}[arzwiki] Change the wordmark (T427720)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:20 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1300226{{!}}[arzwiki] Change the wordmark (T427720)]]
* 19:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 19:53 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 19:30 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 19:27 bblack@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=1) rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 19:23 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2046.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:19 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2046.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:19 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5020
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5020
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2044.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:18 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5020
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5020.eqsin.wmnet 24.0.132.10.in-addr.arpa 4.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:18 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5020.eqsin.wmnet 24.0.132.10.in-addr.arpa 4.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:17 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5020 - brett@cumin2002"
* 19:17 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5020 - brett@cumin2002"
* 19:14 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P<nowiki>{</nowiki>cp2044.codfw.wmnet<nowiki>}</nowiki> and A:cp - testing {{Gerrit|1300236}} ()
* 19:11 brett@cumin2002: START - Cookbook sre.dns.netbox
* 19:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 19:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2174: Migration of db2174.codfw.wmnet completed
* 19:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 19:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1218: Migration of db1218.eqiad.wmnet completed
* 18:24 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5020
* 18:23 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS trixie
* 18:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2174: Migration of db2174.codfw.wmnet completed
* 18:20 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 18:17 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1218: Migration of db1218.eqiad.wmnet completed
* 18:16 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5018.*
* 18:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2174.codfw.wmnet with OS trixie
* 18:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1218.eqiad.wmnet with OS trixie
* 17:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
* 17:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1218.eqiad.wmnet with reason: host reimage
* 17:46 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2010.codfw.wmnet with OS trixie
* 17:45 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 17:44 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 17:44 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2174.codfw.wmnet with reason: host reimage
* 17:42 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1218.eqiad.wmnet with reason: host reimage
* 17:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94021)
* 17:29 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
* 17:26 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1218.eqiad.wmnet with OS trixie
* 17:26 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2174.codfw.wmnet with OS trixie
* 17:25 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:24 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:24 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1218: Upgrading db1218.eqiad.wmnet
* 17:24 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:24 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1218: Upgrading db1218.eqiad.wmnet
* 17:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 17:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2174: Upgrading db2174.codfw.wmnet
* 17:23 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:23 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2010.codfw.wmnet with reason: host reimage
* 17:23 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:22 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2174: Upgrading db2174.codfw.wmnet
* 17:22 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:22 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-upload and not P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 17:22 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:22 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 17:22 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on A:cp-text and not P<nowiki>{</nowiki>cp7008*<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 17:21 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 17:21 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 17:20 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:19 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 17:19 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:18 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:18 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:17 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:17 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:17 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 17:15 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:14 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:13 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:12 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-ntp (exit_code=0) rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 17:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 17:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1206: Migration of db1206.eqiad.wmnet completed
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2010
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2010
* 17:02 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2010
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2010.codfw.wmnet 35.48.192.10.in-addr.arpa 5.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:02 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2010.codfw.wmnet 35.48.192.10.in-addr.arpa 5.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:02 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2010 - jasmine@cumin2002"
* 17:01 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2010 - jasmine@cumin2002"
* 16:57 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 16:50 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2010
* 16:50 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS trixie
* 16:41 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 16:39 bblack@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 16:39 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 16:34 bblack@cumin1003: START - Cookbook sre.cdn.roll-upgrade-varnish rolling upgrade of Varnish on P<nowiki>{</nowiki>cp7008.magru.wmnet<nowiki>}</nowiki> and A:cp - Upgrade wmfuniq to 0.3.0 ()
* 16:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5018.eqsin.wmnet with OS trixie
* 16:22 hnowlan@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 16:20 hnowlan@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 16:17 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1206: Migration of db1206.eqiad.wmnet completed
* 16:15 hnowlan@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 16:15 hnowlan@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 16:14 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 16:12 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 16:12 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:11 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 16:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1206.eqiad.wmnet with OS trixie
* 16:01 blblack: apt: uploaded libvmod-wmfuniq 0.3.0 for trixie
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 15:53 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:52 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:51 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 15:50 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1206.eqiad.wmnet with reason: host reimage
* 15:45 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1206.eqiad.wmnet with reason: host reimage
* 15:43 sukhe@cumin1003: END (FAIL) - Cookbook sre.dns.admin (exit_code=99) DNS admin: depool drmrs [reason: no reason specified, no task ID specified]
* 15:42 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool drmrs [reason: no reason specified, no task ID specified]
* 15:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2173: Migration of db2173.codfw.wmnet completed
* 15:34 topranks: drain traffic through cr2-drmrs to reset pic0
* 15:33 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist mwscript.dblist extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release (dblist: https://phabricator.wikimedia.org/P94013)
* 15:30 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1206.eqiad.wmnet with OS trixie
* 15:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1206: Upgrading db1206.eqiad.wmnet
* 15:28 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1206: Upgrading db1206.eqiad.wmnet
* 15:27 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 15:25 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:24 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1009
* 15:24 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Harroyo-wmf out of all services on: 2436 hosts
* 15:23 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1009
* 15:21 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:20 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release
* 15:19 brett@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5018
* 15:19 brett@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5018
* 15:18 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:18 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cp5018
* 15:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5018.eqsin.wmnet 18.0.132.10.in-addr.arpa 8.1.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 15:18 brett@cumin2002: START - Cookbook sre.dns.wipe-cache cp5018.eqsin.wmnet 18.0.132.10.in-addr.arpa 8.1.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 15:18 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:15 brett@cumin2002: START - Cookbook sre.dns.netbox
* 15:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1195: Migration of db1195.eqiad.wmnet completed
* 15:12 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:11 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:11 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin1003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:11 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin1003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 15:08 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300169{{!}}Fix snak value display for rtl languages (T360854)]], [[gerrit:1300168{{!}}Fix snak value display for rtl languages (T360854)]] (duration: 08m 39s)
* 15:03 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 15:01 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1300169{{!}}Fix snak value display for rtl languages (T360854)]], [[gerrit:1300168{{!}}Fix snak value display for rtl languages (T360854)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:59 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1300169{{!}}Fix snak value display for rtl languages (T360854)]], [[gerrit:1300168{{!}}Fix snak value display for rtl languages (T360854)]]
* 14:58 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:55 Lucas_WMDE: lucaswerkmeister-wmde@deploy1003 $ printf 'https://www.mediawiki.org/keys/%s\n' '' 'keys.txt' 'keys.html' {{!}} mwscript-k8s --attach --comment=[[phab:T423267|T423267]] purgeList mediawikiwiki
* 14:54 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release, now with correct schema
* 14:53 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2173: Migration of db2173.codfw.wmnet completed
* 14:50 ayounsi@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin2003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:50 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2003.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:48 ayounsi@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - ayounsi@cumin1003
* 14:47 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299614{{!}}Add my public key to mediawiki.org/keys (T423267)]] (duration: 08m 33s)
* 14:46 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:42 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, matmarex: Continuing with deployment
* 14:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2173.codfw.wmnet with OS trixie
* 14:40 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, matmarex: Backport for [[gerrit:1299614{{!}}Add my public key to mediawiki.org/keys (T423267)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:40 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:40 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:38 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1299614{{!}}Add my public key to mediawiki.org/keys (T423267)]]
* 14:38 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 14:34 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:34 cmooney@cumin1003: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:33 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 14:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1195: Migration of db1195.eqiad.wmnet completed
* 14:28 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin[2002-2003].codfw.wmnet,cumin1003.eqiad.wmnet with reason: add new eqsin vlans as legacy temp workaround in wmf-plugin.py - cmooney@cumin1003
* 14:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 14:26 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 14:26 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 14:24 atsuko@deploy1003: mwscript-k8s job started: foreachwikiindblist translate extensions/Translate/scripts/ttmserver-export.php --ttmserver eqiad-test # [[phab:T425377|T425377]] populating ttmserver index on test cluster to estimate time required for the release, now with dblist translate
* 14:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2173.codfw.wmnet with reason: host reimage
* 14:23 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 14:22 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 14:22 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 14:21 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 14:20 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart (exit_code=0) rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 14:20 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2173.codfw.wmnet with reason: host reimage
* 14:20 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1195.eqiad.wmnet with OS trixie
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 14:16 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 14:15 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 14:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 14:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 14:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 14:13 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:13 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 14:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 14:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 14:10 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 14:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 14:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 14:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 14:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2173.codfw.wmnet with OS trixie
* 14:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 14:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1195.eqiad.wmnet with reason: host reimage
* 14:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 13:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2173: Upgrading db2173.codfw.wmnet
* 13:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2173: Upgrading db2173.codfw.wmnet
* 13:58 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:58 atsuko@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/ttmserver-export.php --wiki=default --ttmserver eqiad-test # [[phab:T425377|T425377]] populating production index on test cluster to estimate time required for the release
* 13:56 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1195.eqiad.wmnet with reason: host reimage
* 13:54 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Atieno out of all services on: 2436 hosts
* 13:42 Lucas_WMDE: UTC afternoon backport+config window done
* 13:42 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1195.eqiad.wmnet with OS trixie
* 13:36 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297237{{!}}wmf-config: Update private subnets to include additions (T427393)]] (duration: 07m 20s)
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1195: Upgrading db1195.eqiad.wmnet
* 13:33 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling restart_daemons on A:hcaptcha-proxy and A:hcaptcha-proxy
* 13:33 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling restart_daemons on A:durum and A:durum
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2170: Migration of db2170.codfw.wmnet completed
* 13:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1195: Upgrading db1195.eqiad.wmnet
* 13:32 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:32 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, brett: Continuing with deployment
* 13:32 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough
* 13:31 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 13:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, brett: Backport for [[gerrit:1297237{{!}}wmf-config: Update private subnets to include additions (T427393)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:31 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 13:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1297237{{!}}wmf-config: Update private subnets to include additions (T427393)]]
* 13:28 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5018.eqsin.wmnet with reason: host down
* 13:28 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling restart_daemons on A:tcpproxy and A:tcpproxy
* 13:25 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5018.eqsin.wmnet,service=(cdn{{!}}ats-be)
* 13:22 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart rolling restart_daemons on A:dnsbox and (A:dnsbox)
* 13:20 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling restart_daemons on A:durum and A:durum
* 13:20 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling restart_daemons on A:hcaptcha-proxy and A:hcaptcha-proxy
* 13:19 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299676{{!}}Enable ULS v2 on group0 wikis]] (duration: 17m 00s)
* 13:19 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough
* 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1186: Migration of db1186.eqiad.wmnet completed
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:15 sbisson@deploy1003: sbisson, abi: Continuing with deployment
* 13:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy rolling restart_daemons on A:tcpproxy and A:tcpproxy
* 13:05 sbisson@deploy1003: sbisson, abi: Backport for [[gerrit:1299676{{!}}Enable ULS v2 on group0 wikis]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1014.eqiad.wmnet with OS trixie
* 13:02 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1299676{{!}}Enable ULS v2 on group0 wikis]]
* 12:47 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2170: Migration of db2170.codfw.wmnet completed
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5004.eqsin.wmnet with OS bookworm
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:46 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
* 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1014.eqiad.wmnet with reason: host reimage
* 12:42 topranks: re-map DSCP AF41 from 'low' to 'normal' priority QoS class on network [[phab:T424640|T424640]]
* 12:41 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1014.eqiad.wmnet with reason: host reimage
* 12:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2170.codfw.wmnet with OS trixie
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1186: Migration of db1186.eqiad.wmnet completed
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1014
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host rdb1014
* 12:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1186.eqiad.wmnet with OS trixie
* 12:21 jiji@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host rdb1014
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) rdb1014.eqiad.wmnet 42.48.64.10.in-addr.arpa 2.4.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:21 jiji@cumin1003: START - Cookbook sre.dns.wipe-cache rdb1014.eqiad.wmnet 42.48.64.10.in-addr.arpa 2.4.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host rdb1014 - jiji@cumin1003"
* 12:21 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host rdb1014 - jiji@cumin1003"
* 12:20 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2170.codfw.wmnet with reason: host reimage
* 12:16 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 12:13 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1014
* 12:12 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1014.eqiad.wmnet with OS trixie
* 12:12 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2170.codfw.wmnet with reason: host reimage
* 12:08 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300104{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1300102{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1299643{{!}}wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792)]] (duration: 11m 06s)
* 12:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1186.eqiad.wmnet with reason: host reimage
* 12:03 reedy@deploy1003: reedy: Continuing with deployment
* 12:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1186.eqiad.wmnet with reason: host reimage
* 11:59 reedy@deploy1003: reedy: Backport for [[gerrit:1300104{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1300102{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1299643{{!}}wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes c
* 11:57 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1300104{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1300102{{!}}Mandatory2FAChecker: Allow getGroupsRequiring2FA() to work on implicit groups (T420792)]], [[gerrit:1299643{{!}}wmf-config: Add $wmgOATHAuthRequire2FAForAll config (T420792)]]
* 11:53 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2170.codfw.wmnet with OS trixie
* 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ganeti5004
* 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti5004
* 11:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2170: Upgrading db2170.codfw.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2170: Upgrading db2170.codfw.wmnet
* 11:49 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti5004
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti5004.eqsin.wmnet 40.0.132.10.in-addr.arpa 0.4.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 11:49 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ganeti5004.eqsin.wmnet 40.0.132.10.in-addr.arpa 0.4.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5004 - jmm@cumin2002"
* 11:49 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ganeti5004 - jmm@cumin2002"
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:48 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1186.eqiad.wmnet with OS trixie
* 11:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1186: Upgrading db1186.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1186: Upgrading db1186.eqiad.wmnet
* 11:45 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:38 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:35 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 11:34 jmm@cumin2002: START - Cookbook sre.hosts.move-vlan for host ganeti5004
* 11:34 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 11:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5004.eqsin.wmnet with OS bookworm
* 11:34 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 11:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1151: Security updates
* 11:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:33 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:33 root@cumin1003: START - Cookbook sre.mysql.pool pool db1151: Security updates
* 11:31 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:30 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:30 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:30 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:27 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:27 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:23 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:23 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:23 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 11:23 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 11:16 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 11:15 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 11:15 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:15 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:09 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1151: Security updates
* 11:09 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:09 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:09 root@cumin1003: START - Cookbook sre.mysql.depool depool db1151: Security updates
* 11:08 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300092{{!}}ProductionServices: re-add poolcounter2006 (T426736)]] (duration: 06m 55s)
* 11:04 blake@deploy1003: blake: Continuing with deployment
* 11:04 blake@deploy1003: blake: Backport for [[gerrit:1300092{{!}}ProductionServices: re-add poolcounter2006 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:03 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:01 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300092{{!}}ProductionServices: re-add poolcounter2006 (T426736)]]
* 10:59 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2006.codfw.wmnet
* 10:57 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:57 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:56 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:56 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:56 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:56 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2006.codfw.wmnet
* 10:56 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300087{{!}}ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736)]] (duration: 06m 42s)
* 10:51 blake@deploy1003: blake: Continuing with deployment
* 10:51 moritzm: remove ganeti5004 from eqsin cluster for reimage [[phab:T428229|T428229]]
* 10:51 blake@deploy1003: blake: Backport for [[gerrit:1300087{{!}}ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:49 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300087{{!}}ProductionServices: reboot poolcounter2006, re-add poolcounter 2005 (T426736)]]
* 10:47 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2005.codfw.wmnet
* 10:47 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:46 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:46 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:45 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:43 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2005.codfw.wmnet
* 10:43 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300082{{!}}ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736)]] (duration: 07m 38s)
* 10:41 moritzm: installing nginx security updates
* 10:38 blake@deploy1003: blake: Continuing with deployment
* 10:38 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1152: Security updates
* 10:38 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:38 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:38 root@cumin1003: START - Cookbook sre.mysql.pool pool db1152: Security updates
* 10:38 blake@deploy1003: blake: Backport for [[gerrit:1300082{{!}}ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:37 moritzm: failover Ganeti master in eqsin to ganeti5007 [[phab:T428229|T428229]]
* 10:35 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300082{{!}}ProductionServices: reboot poolcounter2005, re-add poolcounter 1007 (T426736)]]
* 10:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:34 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1007.eqiad.wmnet
* 10:29 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1007.eqiad.wmnet
* 10:29 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300072{{!}}ProductionServices: reboot poolcounter1007 (T426736)]] (duration: 07m 45s)
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 10:27 jmm@cumin2002: DONE (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for sretest2009.codfw.wmnet: Renew puppet certificate - jmm@cumin2002
* 10:24 blake@deploy1003: blake: Continuing with deployment
* 10:23 blake@deploy1003: blake: Backport for [[gerrit:1300072{{!}}ProductionServices: reboot poolcounter1007 (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:21 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:21 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300072{{!}}ProductionServices: reboot poolcounter1007 (T426736)]]
* 10:21 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:21 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:21 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:20 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1006.eqiad.wmnet
* 10:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Security updates
* 10:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:14 root@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:14 root@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Security updates
* 10:13 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1006.eqiad.wmnet
* 10:12 blake@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300064{{!}}ProductionServices: reboot poolcounter1006.eqiad (T426736)]] (duration: 07m 46s)
* 10:07 blake@deploy1003: blake: Continuing with deployment
* 10:06 blake@deploy1003: blake: Backport for [[gerrit:1300064{{!}}ProductionServices: reboot poolcounter1006.eqiad (T426736)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:04 blake@deploy1003: Started scap sync-world: Backport for [[gerrit:1300064{{!}}ProductionServices: reboot poolcounter1006.eqiad (T426736)]]
* 09:57 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1300058{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]], [[gerrit:1300059{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]] (duration: 09m 32s)
* 09:52 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1300058{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]], [[gerrit:1300059{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1300058{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]], [[gerrit:1300059{{!}}SourceEditorOverlay: Show CAPTCHA panel when AF challenge closed (T425929)]]
* 09:35 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 09:34 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 09:32 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 09:32 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 09:26 moritzm: upgrade routinator in eqiad to 0.15.2 [[phab:T428456|T428456]]
* 09:23 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:23 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 09:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus5003.eqsin.wmnet to plain
* 09:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus5003.eqsin.wmnet to plain
* 09:15 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 09:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 09:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:29 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:29 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:09 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 08:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:05 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:04 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:01 fceratto@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host db1215.eqiad.wmnet with OS trixie
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 07:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
* 07:44 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1215.eqiad.wmnet with reason: host reimage
* 07:41 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 07:40 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
* 07:40 moritzm: installing openssl security updates
* 07:39 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1215.eqiad.wmnet with reason: host reimage
* 07:38 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 07:37 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 07:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:29 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299556{{!}}ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561{{!}}ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529{{!}}translate: adding separate read/write endpoints (T425377)]] (duration: 14m 03s)
* 07:25 atsuko@deploy1003: atsuko: Continuing with deployment
* 07:23 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1215.eqiad.wmnet with OS trixie
* 07:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1215.eqiad.wmnet with reason: Reimage
* 07:21 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:20 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:20 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:17 atsuko@deploy1003: atsuko: Backport for [[gerrit:1299556{{!}}ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561{{!}}ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529{{!}}translate: adding separate read/write endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be veri
* 07:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:15 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1299556{{!}}ElasticSearchTtmServer: drop include_type_name and support int replicas (T428168)]], [[gerrit:1299561{{!}}ElasticSearchTtmServer: clean stale _doc usage and version error output (T428168)]], [[gerrit:1299529{{!}}translate: adding separate read/write endpoints (T425377)]]
* 07:14 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:12 atsukoito: backporting extensions/Translate to wmf/1.47.0-wmf.5 and applying the config
* 07:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 07:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 07:11 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 06:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 06:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 05:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 05:43 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 05:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 05:41 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 47s)
* 02:07 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1008.eqiad.wmnet with OS trixie
* 02:03 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 02:02 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:52 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:51 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:51 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:50 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:50 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 01:49 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1008.eqiad.wmnet with reason: host reimage
* 01:49 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 01:49 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 01:49 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 01:49 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:48 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 01:48 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 01:47 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 01:47 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 01:46 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 01:46 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 01:45 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 01:44 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 01:44 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 01:43 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 01:43 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1008.eqiad.wmnet with reason: host reimage
* 01:25 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main1008
* 01:24 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main1008
* 01:24 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main1008
* 01:24 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main1008.eqiad.wmnet 45.32.64.10.in-addr.arpa 5.4.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 01:23 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main1008.eqiad.wmnet 45.32.64.10.in-addr.arpa 5.4.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 01:23 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:23 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1008 - jasmine@cumin2002"
* 01:23 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main1008 - jasmine@cumin2002"
* 01:19 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 01:12 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main1008
* 01:11 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1008.eqiad.wmnet with OS trixie
* 01:00 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2009.codfw.wmnet with OS trixie
* 00:54 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 00:53 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 00:43 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
* 00:40 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:39 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 00:38 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2009.codfw.wmnet with reason: host reimage
* 00:38 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 00:38 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 00:37 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 00:37 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:36 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 00:36 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 00:35 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 00:34 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 00:34 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 00:33 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 00:33 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 00:32 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 00:32 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 00:32 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2009
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2009
* 00:15 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2009
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2009.codfw.wmnet 33.48.192.10.in-addr.arpa 3.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:15 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2009.codfw.wmnet 33.48.192.10.in-addr.arpa 3.3.0.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:15 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2009 - jasmine@cumin2002"
* 00:15 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2009 - jasmine@cumin2002"
* 00:10 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 00:03 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2009
* 00:03 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2009.codfw.wmnet with OS trixie
== 2026-06-09 ==
* 22:50 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299640{{!}}HandleSectionLinks: add temporary fallback to identify html headings (T428677)]] (duration: 08m 59s)
* 22:45 cscott@deploy1003: cscott: Continuing with deployment
* 22:43 cscott@deploy1003: cscott: Backport for [[gerrit:1299640{{!}}HandleSectionLinks: add temporary fallback to identify html headings (T428677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:41 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1299640{{!}}HandleSectionLinks: add temporary fallback to identify html headings (T428677)]]
* 22:15 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299639{{!}}[Bug] Donor Badge: Remove client prefs for control group (T428501)]] (duration: 20m 57s)
* 22:11 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:07 mutante: gerrit - apache httpd log file location moved to /srv/gerrit/site_path/review_site/logs/ [[phab:T425667|T425667]]
* 22:06 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on gerrit2003.wikimedia.org with reason: debug
* 21:56 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1299639{{!}}[Bug] Donor Badge: Remove client prefs for control group (T428501)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1299639{{!}}[Bug] Donor Badge: Remove client prefs for control group (T428501)]]
* 21:52 ryankemper: [[phab:T428241|T428241]] removed retired wdqs2009 full-graph journal dump (446G x2, ~892G) from clouddumps100[1-2]:/srv/dumps/xmldatadumps/public/other/wdqs
* 21:49 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299602{{!}}Revert "Create VectorComponentPageToolbar component" (T428649)]] (duration: 08m 16s)
* 21:48 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
* 21:45 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 21:43 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1299602{{!}}Revert "Create VectorComponentPageToolbar component" (T428649)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:41 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1299602{{!}}Revert "Create VectorComponentPageToolbar component" (T428649)]]
* 21:34 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit1003.wikimedia.org with reason: debug
* 21:27 maryum: Deployed security fix for [[phab:T428324|T428324]]
* 21:24 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
* 21:15 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
* 21:06 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
* 20:50 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs2002.codfw.wmnet with OS trixie
* 20:50 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299588{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270)]], [[gerrit:1299589{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T428270)]] (duration: 11m 13s)
* 20:46 cscott@deploy1003: cscott: Continuing with deployment
* 20:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs2002.codfw.wmnet with OS trixie
* 20:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:42 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:41 cscott@deploy1003: cscott: Backport for [[gerrit:1299588{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270)]], [[gerrit:1299589{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T428270)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:39 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1299588{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T378906 T420336 T424427 T427664 T427972 T428452 T428270)]], [[gerrit:1299589{{!}}Bump wikimedia/parsoid to 0.24.0-a8 (T428270)]]
* 20:38 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:33 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:32 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299454{{!}}wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902)]] (duration: 22m 08s)
* 20:28 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:28 cscott@deploy1003: cscott, gkyziridis: Continuing with deployment
* 20:24 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2004
* 20:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2004
* 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2003
* 20:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2003
* 20:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2002
* 20:13 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2002
* 20:13 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2001
* 20:13 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2001
* 20:12 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:12 cscott@deploy1003: cscott, gkyziridis: Backport for [[gerrit:1299454{{!}}wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1299454{{!}}wgRestSandboxSpecs: Add lift-wing spec pointing to api.wikimedia.org (T427902)]]
* 20:09 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:04 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:59 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:53 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:48 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:47 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:47 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:28 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wdqs1015.eqiad.wmnet
* 19:28 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:28 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wdqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
* 19:27 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wdqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
* 19:20 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
* 19:15 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2008.codfw.wmnet with OS trixie
* 19:15 ryankemper@cumin2002: START - Cookbook sre.hosts.decommission for hosts wdqs1015.eqiad.wmnet
* 19:12 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 19:12 jasmine@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 19:00 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:58 jasmine@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 18:58 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
* 18:58 jasmine@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 18:58 jasmine@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:57 jasmine@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:57 jasmine@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 18:56 jasmine@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 18:56 jasmine@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:55 jasmine@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:55 jasmine@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 18:55 jasmine@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 18:54 jasmine@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 18:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:54 jasmine@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 18:53 jasmine@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 18:53 jasmine@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 18:53 jasmine@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2003 to codfw - jhancock@cumin2002"
* 18:52 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2003 to codfw - jhancock@cumin2002"
* 18:52 jasmine@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 18:52 jasmine@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 18:51 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2008.codfw.wmnet with reason: host reimage
* 18:51 jasmine@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 18:51 jasmine@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 18:51 jasmine@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 18:50 jasmine@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 18:50 jasmine@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 18:47 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:47 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:47 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:42 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:42 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:31 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 18:29 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS trixie
* 18:26 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2008.codfw.wmnet with OS trixie
* 17:48 mutante: https://releases.wikimedia.org {{!}} https://releases-jenkins.wikimedia.org - down for maintenance [[phab:T418299|T418299]]
* 17:48 cmooney@dns2005: END - running authdns-update
* 17:47 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet with reason: reimage
* 17:47 cmooney@dns2005: START - running authdns-update
* 17:46 sukhe: sudo cumin 'A:hcaptcha-proxy' 'run-puppet-agent': rolling out CR {{Gerrit|1299427}} [[phab:T428539|T428539]]
* 17:43 jayme: kafka-main2008 is down due to hardware failure [[phab:T428654|T428654]]
* 17:32 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1002.eqiad.wmnet with OS trixie
* 17:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
* 17:06 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
* 17:05 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-main2008
* 17:05 jasmine@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-main2008
* 17:04 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:04 jasmine@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-main2008
* 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-main2008.codfw.wmnet 4.32.192.10.in-addr.arpa 4.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:04 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:04 jasmine@cumin2002: START - Cookbook sre.dns.wipe-cache kafka-main2008.codfw.wmnet 4.32.192.10.in-addr.arpa 4.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:04 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2008 - jasmine@cumin2002"
* 17:04 brett@cumin2002: START - Cookbook sre.hosts.move-vlan for host cp5018
* 17:04 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-main2008 - jasmine@cumin2002"
* 17:03 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5018.eqsin.wmnet with OS trixie
* 16:58 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:58 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:57 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 16:57 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:57 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:50 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf1002.eqiad.wmnet with OS trixie
* 16:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:47 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1001.eqiad.wmnet with OS trixie
* 16:47 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 16:47 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/redioscope: apply
* 16:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:41 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 16:41 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 16:35 jasmine@cumin2002: START - Cookbook sre.hosts.move-vlan for host kafka-main2008
* 16:34 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2008.codfw.wmnet with OS trixie
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:31 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
* 16:30 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
* 16:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
* 16:29 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:26 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
* 16:23 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
* 16:22 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: apply
* 16:20 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:16 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:13 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:12 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf1001.eqiad.wmnet with OS trixie
* 16:10 jiji@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'sync'.
* 16:09 jiji@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'sync'.
* 16:07 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf2002.codfw.wmnet with OS trixie
* 16:02 jiji@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 16:02 jiji@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 16:00 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'.
* 15:59 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
* 15:59 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'.
* 15:59 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 15:59 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'.
* 15:59 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/termbox: apply
* 15:58 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/termbox: apply
* 15:58 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/termbox: apply
* 15:57 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 15:57 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 15:57 lucaswerkmeister-wmde@deploy1003: helmfile [staging] DONE helmfile.d/services/termbox: apply
* 15:56 lucaswerkmeister-wmde@deploy1003: helmfile [staging] START helmfile.d/services/termbox: apply
* 15:54 jiji@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:53 jiji@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 15:51 jiji@deploy1003: Finished scap sync-world: redeploy {{Gerrit|1299468}} (duration: 07m 23s)
* 15:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf2002.codfw.wmnet with reason: host reimage
* 15:47 jiji@deploy1003: jiji: Continuing with deployment
* 15:46 jiji@deploy1003: jiji: redeploy {{Gerrit|1299468}} synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:46 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf2002.codfw.wmnet with reason: host reimage
* 15:45 jiji@deploy1003: Started scap sync-world: redeploy {{Gerrit|1299468}}
* 15:43 brouberol@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-eqiad
* 15:34 brennen@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab1004 for [[phab:T410849|T410849]] (followup for robots.txt) (duration: 00m 40s)
* 15:33 brennen@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab1004 for [[phab:T410849|T410849]] (followup for robots.txt)
* 15:33 brennen@deploy1003: Finished deploy [phabricator/deployment@73e57ce]: deploy phab2002 for [[phab:T410849|T410849]] (followup for robots.txt) (duration: 00m 45s)
* 15:32 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299468{{!}}ProductionServices.php: switch filebackend.php to rdb2015:6381 #2 (T418918 T291916)]] (duration: 07m 21s)
* 15:32 brennen@deploy1003: Started deploy [phabricator/deployment@73e57ce]: deploy phab2002 for [[phab:T410849|T410849]] (followup for robots.txt)
* 15:28 jiji@deploy1003: Rolling back deployment
* 15:27 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf2002.codfw.wmnet with OS trixie
* 15:27 jiji@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 15:26 jiji@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 15:25 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1299468{{!}}ProductionServices.php: switch filebackend.php to rdb2015:6381 #2 (T418918 T291916)]]
* 15:22 urbanecm: Remove `migrateMentorStatusAwayToCommunityConfiguration` from updatelog on all wikis ([[phab:T409170|T409170]]; the script was only ever run as a dry-run)
* 15:21 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 15:21 jiji@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 15:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf2001.codfw.wmnet with OS trixie
* 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@d244a3e]: deploy phab1004 for [[phab:T410849|T410849]] (duration: 00m 42s)
* 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@d244a3e]: deploy phab1004 for [[phab:T410849|T410849]]
* 15:02 brennen@deploy1003: Finished deploy [phabricator/deployment@d244a3e]: deploy phab2002 for [[phab:T410849|T410849]] (duration: 00m 45s)
* 15:01 brennen@deploy1003: Started deploy [phabricator/deployment@d244a3e]: deploy phab2002 for [[phab:T410849|T410849]]
* 14:58 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf2001.codfw.wmnet with reason: host reimage
* 14:52 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf2001.codfw.wmnet with reason: host reimage
* 14:52 arnaudb@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab[2002-2003].codfw.wmnet,phab[1004-1006].eqiad.wmnet with reason: [[phab:T410849|T410849]]
* 14:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 14:46 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 14:40 moritzm: upgrade routinator in codfw to 0.15.2 [[phab:T428456|T428456]]
* 14:35 brouberol@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 14:33 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc-wf2001.codfw.wmnet with OS trixie
* 14:26 brouberol@cumin1003: END (ERROR) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=97) rolling reboot on A:cephosd-eqiad
* 14:26 brouberol@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 14:20 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-codfw
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host parsoidtest1001.eqiad.wmnet
* 14:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2153: Migration of db2153.codfw.wmnet completed
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of rpki2003.codfw.wmnet to drbd
* 14:14 moritzm: imported routinator 0.15.2-1bookworm to thirdparty/routinator for bookworm-wikimedia [[phab:T428456|T428456]]
* 14:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1184: Migration of db1184.eqiad.wmnet completed
* 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host parsoidtest1001.eqiad.wmnet
* 14:07 Dreamy_Jazz: Afternoon UTC backport window done
* 14:07 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:06 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299495{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]], [[gerrit:1299502{{!}}SecurePollLogPager: Cast user IDs to ints before use (T428599)]] (duration: 06m 53s)
* 14:06 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: rack depool
* 14:03 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of rpki2003.codfw.wmnet to drbd
* 14:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow2004.codfw.wmnet to drbd
* 14:02 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:02 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1299495{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]], [[gerrit:1299502{{!}}SecurePollLogPager: Cast user IDs to ints before use (T428599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:59 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1299495{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]], [[gerrit:1299502{{!}}SecurePollLogPager: Cast user IDs to ints before use (T428599)]]
* 13:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:58 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:56 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:56 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 13:56 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 13:56 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 13:55 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:55 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* {{safesubst:SAL entry|1=13:55 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298929{{!}}Simplify fragment processing (T423700)]], [[gerrit:1298926{{!}}Move ::getFragmentsToTransform() to Content<nowiki>{</nowiki>Text,DOM<nowiki>}</nowiki>TransformStage]], [[gerrit:1298927{{!}}OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages]], [[gerrit:1298925{{!}}Reset DeduplicateStyles state between different pipeline executions (T428336 T428215)]], [[gerrit:1299497}}
* 13:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:51 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow2004.codfw.wmnet to drbd
* 13:50 cscott@deploy1003: cscott: Continuing with deployment
* 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2045.codfw.wmnet to cluster codfw and group A
* 13:48 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2045.codfw.wmnet to cluster codfw and group A
* 13:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2027.codfw.wmnet to cluster codfw and group A
* 13:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2027.codfw.wmnet to cluster codfw and group A
* 13:46 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 13:45 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 13:44 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* {{safesubst:SAL entry|1=13:42 cscott@deploy1003: cscott: Backport for [[gerrit:1298929{{!}}Simplify fragment processing (T423700)]], [[gerrit:1298926{{!}}Move ::getFragmentsToTransform() to Content<nowiki>{</nowiki>Text,DOM<nowiki>}</nowiki>TransformStage]], [[gerrit:1298927{{!}}OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages]], [[gerrit:1298925{{!}}Reset DeduplicateStyles state between different pipeline executions (T428336 T428215)]], [[gerrit:1299497{{!}}Store indicators}}
* 13:41 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* {{safesubst:SAL entry|1=13:40 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1298929{{!}}Simplify fragment processing (T423700)]], [[gerrit:1298926{{!}}Move ::getFragmentsToTransform() to Content<nowiki>{</nowiki>Text,DOM<nowiki>}</nowiki>TransformStage]], [[gerrit:1298927{{!}}OutputTransform: Rename DeduplicateStyles and ExpandToAbsoluteUrls stages]], [[gerrit:1298925{{!}}Reset DeduplicateStyles state between different pipeline executions (T428336 T428215)]], [[gerrit:1299497{{!}}}}
* 13:40 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw
* 13:39 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 13:37 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 13:35 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 13:33 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 13:32 ayounsi@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 13:32 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298834{{!}}config: Disable EmailConfirmationBanner on all wikis (T428291)]] (duration: 07m 01s)
* 13:30 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2153: Migration of db2153.codfw.wmnet completed
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:28 lucaswerkmeister-wmde@deploy1003: mmartorana, lucaswerkmeister-wmde: Continuing with deployment
* 13:27 lucaswerkmeister-wmde@deploy1003: mmartorana, lucaswerkmeister-wmde: Backport for [[gerrit:1298834{{!}}config: Disable EmailConfirmationBanner on all wikis (T428291)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:26 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1184: Migration of db1184.eqiad.wmnet completed
* 13:25 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298834{{!}}config: Disable EmailConfirmationBanner on all wikis (T428291)]]
* 13:25 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:24 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:21 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 13:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2153.codfw.wmnet with OS trixie
* 13:20 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2241: rack depool
* 13:20 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1237: repool after maintenance db1237
* 13:19 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298654{{!}}Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206)]] (duration: 09m 40s)
* 13:17 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2006.codfw.wmnet
* 13:17 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2006.codfw.wmnet
* 13:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2251-2253].codfw.wmnet
* 13:16 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2251-2253].codfw.wmnet
* 13:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2005.codfw.wmnet
* 13:16 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2005.codfw.wmnet
* 13:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1184.eqiad.wmnet with OS trixie
* 13:14 lucaswerkmeister-wmde@deploy1003: neriah, lucaswerkmeister-wmde: Continuing with deployment
* 13:11 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
* 13:11 lucaswerkmeister-wmde@deploy1003: neriah, lucaswerkmeister-wmde: Backport for [[gerrit:1298654{{!}}Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:09 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298654{{!}}Enable wgNewUserMessageOnFirstEdit on commonswiki (T426206)]]
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2153.codfw.wmnet with reason: host reimage
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:04 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1015.eqiad.wmnet with OS trixie
* 12:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1184.eqiad.wmnet with reason: host reimage
* 12:58 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2153.codfw.wmnet with reason: host reimage
* 12:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1016.eqiad.wmnet with OS trixie
* 12:57 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
* 12:56 XioNoX: lsw1-a4-codfw> request system reboot - [[phab:T427357|T427357]]
* 12:55 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1184.eqiad.wmnet with reason: host reimage
* 12:50 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299477{{!}}hCaptcha: Roll out to all wikis for api account creation. (T426050)]] (duration: 07m 21s)
* 12:46 kharlan@deploy1003: kharlan, dbrant: Continuing with deployment
* 12:46 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
* 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 12:45 kharlan@deploy1003: kharlan, dbrant: Backport for [[gerrit:1299477{{!}}hCaptcha: Roll out to all wikis for api account creation. (T426050)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:45 topranks: shut sub-interfaces for row A/B legacy vlans on cr1-codfw [[phab:T427357|T427357]]
* 12:45 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
* 12:43 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1299477{{!}}hCaptcha: Roll out to all wikis for api account creation. (T426050)]]
* 12:42 topranks: increase OSPF cost on ssw1-a1-codfw link to lsw1-a4-codfw to force traffic via alternate spine [[phab:T427357|T427357]]
* 12:41 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1299478{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]] (duration: 07m 02s)
* 12:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 12:40 moritzm: installing wireshark security updates
* 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2153.codfw.wmnet with OS trixie
* 12:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1184.eqiad.wmnet with OS trixie
* 12:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:36 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1299478{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2153: Upgrading db2153.codfw.wmnet
* 12:34 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1237: repool after maintenance db1237
* 12:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1299478{{!}}STVFormatter: Cast strings to float before passing to round (T428584)]]
* 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2153: Upgrading db2153.codfw.wmnet
* 12:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1184: Upgrading db1184.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1184: Upgrading db1184.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1237.eqiad.wmnet with OS trixie
* 12:32 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 12:32 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 12:29 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:29 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:27 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2005.codfw.wmnet
* 12:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2046: repool after maintenance
* 12:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2006.codfw.wmnet
* 12:23 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298829{{!}}wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126)]] (duration: 16m 04s)
* 12:23 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2006.codfw.wmnet
* 12:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2251-2253].codfw.wmnet
* 12:22 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2005.codfw.wmnet
* 12:20 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2251-2253].codfw.wmnet
* 12:20 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:20 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: rack depool
* 12:20 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:20 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2241: rack depool
* 12:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1016
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1016
* 12:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host rdb1015
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.move-vlan for host rdb1015
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1016.eqiad.wmnet with OS trixie
* 12:19 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1015.eqiad.wmnet with OS trixie
* 12:17 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.depool-rack (exit_code=99) with action 'depool' for codfw rack A4
* 12:17 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 24 hosts with reason: Rack A4 depool
* 12:16 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Continuing with deployment
* 12:15 topranks: drain traffic on ssw1-a1-codfw - add gshut community in evpn underlay - [[phab:T427357|T427357]]
* 12:14 ayounsi@cumin1003: START - Cookbook sre.network.depool-rack with action 'depool' for codfw rack A4
* 12:13 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Backport for [[gerrit:1298829{{!}}wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1237.eqiad.wmnet with reason: host reimage
* 12:07 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1298829{{!}}wmf-config: Enable hCaptcha on UploadWizard publish for testwiki (T426126)]]
* 12:05 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1237.eqiad.wmnet with reason: host reimage
* 12:00 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Dmaza out of all services on: 2435 hosts
* 11:51 atsuko@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 11:51 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
* 11:49 atsuko@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 11:48 atsuko@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 11:47 atsuko@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 11:45 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:44 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:43 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:43 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2046: repool after maintenance
* 11:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:36 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2046.codfw.wmnet with OS trixie
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2185.codfw.wmnet with reason: Reimage
* 11:31 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging HMonroy out of all services on: 2435 hosts
* 11:28 root@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging KSiebert out of all services on: 2435 hosts
* 11:26 slyngs: CAS-SSO upgrade to version 7.3.7.2
* 11:26 slyngshede@dns1004: END - running authdns-update
* 11:24 slyngshede@dns1004: START - running authdns-update
* 11:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2046.codfw.wmnet with reason: host reimage
* 11:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1043: repool after upgrade
* 11:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2046.codfw.wmnet with reason: host reimage
* 10:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2046.codfw.wmnet with OS trixie
* 10:53 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2046: Upgrading es2046.codfw.wmnet
* 10:53 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2046: Upgrading es2046.codfw.wmnet
* 10:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 10:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 10:52 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 10:52 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:52 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 10:51 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:32 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1043: repool after upgrade
* 10:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:28 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1160: Repooling
* 10:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1043.eqiad.wmnet with OS trixie
* 10:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:17 elukey: complete rollout of apache2 upgrades
* 10:16 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:15 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:13 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:12 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:12 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1043.eqiad.wmnet with reason: host reimage
* 10:04 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1043.eqiad.wmnet with reason: host reimage
* 10:04 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:04 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:04 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:57 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
* 09:51 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:51 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:50 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:50 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1043.eqiad.wmnet with OS trixie
* 09:48 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1043: Upgrading es1043.eqiad.wmnet
* 09:48 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:47 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:45 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 09:41 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 09:36 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=5 --verbose --last-checked="20260603"` (after stopping previous scan run)
* 09:34 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=5 --verbose` (after stopping previous scan run)
* 09:27 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 09:26 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 09:17 fceratto@cumin1003: MariaDB change: Setting sections s5 as read-write
* 09:17 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1043: Upgrading es1043.eqiad.wmnet
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:12 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1042 to es4 eqiad primary [[phab:T428386|T428386]]', diff saved to https://phabricator.wikimedia.org/P93943 and previous config saved to /var/cache/conftool/dbconfig/20260609-091215-marostegui.json
* 09:11 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1043 to es4 eqiad primary [[phab:T428386|T428386]]', diff saved to https://phabricator.wikimedia.org/P93942 and previous config saved to /var/cache/conftool/dbconfig/20260609-091147-marostegui.json
* 09:03 jiji@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=registry2005.codfw.wmnet
* 08:59 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:59 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:57 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1237.eqiad.wmnet with OS trixie
* 08:55 jiji@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=registry2005.codfw.wmnet
* 08:55 jiji@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=registry2004.codfw.wmnet
* 08:50 jiji@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=registry2004.codfw.wmnet
* 08:22 jiji@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=docker-registry,name=codfw
* 08:22 jiji@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=docker-registry,name=eqiad
* 08:08 jiji@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=docker-registry,name=eqiad
* 08:08 jiji@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=docker-registry,name=codfw
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix typoes - ayounsi@cumin1003"
* 07:59 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: fix typoes - ayounsi@cumin1003"
* 07:52 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:47 brouberol@dns1004: END - running authdns-update
* 07:46 brouberol@dns1004: START - running authdns-update
* 07:44 brouberol@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:43 brouberol@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:43 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:42 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:41 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:39 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-ui: apply
* 07:38 brouberol@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 07:37 brouberol@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 07:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
* 07:36 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.major-upgrade (exit_code=97)
* 07:36 brouberol@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 07:36 brouberol@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:26 fceratto@dns1004: END - running authdns-update
* 07:24 fceratto@dns1004: START - running authdns-update
* 07:22 marostegui@dns1004: END - running authdns-update
* 07:21 marostegui@dns1004: START - running authdns-update
* 07:19 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:19 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fix dse-k8s-wdqs2002 duplicate ipv6 address - elukey@cumin1003"
* 07:19 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fix dse-k8s-wdqs2002 duplicate ipv6 address - elukey@cumin1003"
* 07:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 07:12 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 07:11 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1160: Repooling
* 07:11 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
* 07:11 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1160: Repooling
* 07:11 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1160: Repooling
* 07:00 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:00 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1237.eqiad.wmnet with OS trixie
* 06:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1160 [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93940 and previous config saved to /var/cache/conftool/dbconfig/20260609-062412-fceratto.json
* 06:17 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:16 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:16 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:16 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:15 cscott@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:15 cscott@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:15 cscott@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:14 cscott@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:12 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1244 to s4 primary and set section read-write [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93939 and previous config saved to /var/cache/conftool/dbconfig/20260609-061222-fceratto.json
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Set s4 eqiad as read-only for maintenance - [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93938 and previous config saved to /var/cache/conftool/dbconfig/20260609-061131-fceratto.json
* 06:10 federico3: Starting s4 eqiad failover from db1160 to db1244 - [[phab:T426086|T426086]]
* 06:01 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1244 with weight 0 [[phab:T426086|T426086]]', diff saved to https://phabricator.wikimedia.org/P93937 and previous config saved to /var/cache/conftool/dbconfig/20260609-060121-fceratto.json
* 06:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 40 hosts with reason: Primary switchover s4 [[phab:T426086|T426086]]
* 05:40 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1237.eqiad.wmnet with OS trixie
* 05:37 marostegui@dns1004: START - running authdns-update
* 05:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1237: Upgrading db1237.eqiad.wmnet
* 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1237: Upgrading db1237.eqiad.wmnet
* 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1237 [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93935 and previous config saved to /var/cache/conftool/dbconfig/20260609-052420-marostegui.json
* 05:23 marostegui@dns1004: START - running authdns-update
* 05:23 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1220 to x1 primary and set section read-write [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93934 and previous config saved to /var/cache/conftool/dbconfig/20260609-052311-marostegui.json
* 05:22 marostegui@cumin1003: dbctl commit (dc=all): 'Set x1 eqiad as read-only for maintenance - [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93933 and previous config saved to /var/cache/conftool/dbconfig/20260609-052253-marostegui.json
* 05:22 marostegui: Starting x1 eqiad failover from db1237 to db1220 - [[phab:T428158|T428158]]
* 05:19 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1220 with weight 0 [[phab:T428158|T428158]]', diff saved to https://phabricator.wikimedia.org/P93932 and previous config saved to /var/cache/conftool/dbconfig/20260609-051859-marostegui.json
* 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 16 hosts with reason: Primary switchover x1 [[phab:T428158|T428158]]
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.3 (duration: 02m 43s)
* 03:40 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.6 refs [[phab:T423915|T423915]] (duration: 37m 16s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.6 refs [[phab:T423915|T423915]]
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-06-08 ==
* 22:00 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298915{{!}}CommonSettings: Set $wgScoreSafeMode = false (T428484)]] (duration: 07m 42s)
* 21:56 reedy@deploy1003: reedy: Continuing with deployment
* 21:54 reedy@deploy1003: reedy: Backport for [[gerrit:1298915{{!}}CommonSettings: Set $wgScoreSafeMode = false (T428484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:53 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1298915{{!}}CommonSettings: Set $wgScoreSafeMode = false (T428484)]]
* 21:12 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298891{{!}}OOUIHTMLForm: Avoid treating form header as a clickable label (T428359)]] (duration: 08m 10s)
* 21:07 mlitn@deploy1003: mlitn, neriah: Continuing with deployment
* 21:05 mlitn@deploy1003: mlitn, neriah: Backport for [[gerrit:1298891{{!}}OOUIHTMLForm: Avoid treating form header as a clickable label (T428359)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:03 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1298891{{!}}OOUIHTMLForm: Avoid treating form header as a clickable label (T428359)]]
* 20:43 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297162{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias]], [[gerrit:1298841{{!}}Squashed diff to master]] (duration: 07m 05s)
* 20:39 mlitn@deploy1003: mlitn: Continuing with deployment
* 20:38 mlitn@deploy1003: mlitn: Backport for [[gerrit:1297162{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias]], [[gerrit:1298841{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:36 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1297162{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias]], [[gerrit:1298841{{!}}Squashed diff to master]]
* 20:29 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298390{{!}}English Wikibooks: update FlaggedRevs configuration (T428329)]], [[gerrit:1298328{{!}}English Wikiversity: Add new user group "autopatrolled" (T428269)]] (duration: 08m 58s)
* 20:25 mlitn@deploy1003: mlitn, vadymts1: Continuing with deployment
* 20:22 mlitn@deploy1003: mlitn, vadymts1: Backport for [[gerrit:1298390{{!}}English Wikibooks: update FlaggedRevs configuration (T428329)]], [[gerrit:1298328{{!}}English Wikiversity: Add new user group "autopatrolled" (T428269)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:20 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1298390{{!}}English Wikibooks: update FlaggedRevs configuration (T428329)]], [[gerrit:1298328{{!}}English Wikiversity: Add new user group "autopatrolled" (T428269)]]
* 20:03 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298879{{!}}SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437)]] (duration: 37m 43s)
* 19:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:31 kharlan@deploy1003: kharlan: Continuing with deployment
* 19:30 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:29 kharlan@deploy1003: kharlan: Backport for [[gerrit:1298879{{!}}SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:28 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:27 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:25 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1298879{{!}}SimpleCaptcha: Re-render captcha when edit form is redisplayed (T428437)]]
* 19:24 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab (duration: 01m 32s)
* 19:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:22 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab
* 19:20 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab (duration: 01m 40s)
* 19:19 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab
* 19:16 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:14 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2002.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs2001.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 18:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2004
* 18:52 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2004
* 18:52 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2003
* 18:52 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2003
* 18:51 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:51 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2004 to codfw - jhancock@cumin2002"
* 18:51 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2004 to codfw - jhancock@cumin2002"
* 18:44 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:42 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:42 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2030 to codfw - jhancock@cumin2002"
* 18:42 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2030 to codfw - jhancock@cumin2002"
* 18:37 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:33 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2002
* 18:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2002
* 18:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2002 to codfw - jhancock@cumin2002"
* 18:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs2002 to codfw - jhancock@cumin2002"
* 18:25 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:22 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs2001
* 18:22 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs2001
* 18:21 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating dse-k8s-wdqs2001 to codfw - jhancock@cumin2002"
* 18:21 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating dse-k8s-wdqs2001 to codfw - jhancock@cumin2002"
* 18:17 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 18:02 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T427286|T427286]] (duration: 00m 12s)
* 18:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T427286|T427286]]
* 17:37 jnuche@deploy1003: Installation of scap version "4.268.0" completed for 2 hosts
* 17:35 jnuche@deploy1003: Installing scap version "4.268.0" for 2 host(s)
* 17:21 claime: restarting varnish-frontend service on cp6012
* 17:21 claime: restarting varnish-frontend service on cp6011
* 17:21 claime: restarted varnish-frontend service on cp6009
* 17:13 taavi: bounce sirenbot to get it to re-join a channel
* 17:05 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 17:05 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:58 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 16:57 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 16:55 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 16:53 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 16:53 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 16:52 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 16:30 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 16:29 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 16:29 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:28 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 16:27 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 16:27 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 16:26 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 16:26 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 16:25 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 16:18 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 16:17 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 16:17 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 16:16 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 16:16 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:16 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:16 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 16:15 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 16:14 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 16:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 16:14 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 16:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 16:13 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 16:13 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 16:13 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 16:12 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 16:12 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 16:10 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 16:10 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 16:09 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 16:08 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 16:08 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 16:07 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 16:06 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 15:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2042: repool after upgrade
* 15:45 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db[2183-2184].codfw.wmnet
* 15:45 jynus@cumin2002: START - Cookbook sre.hosts.remove-downtime for db[2183-2184].codfw.wmnet
* 15:18 jynus: dbmaint on backup1-codfw@codfw ([[phab:T428467|T428467]])
* 15:12 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2042: repool after upgrade
* 15:12 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 15:09 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:09 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2042.codfw.wmnet with OS trixie
* 15:04 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:04 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:03 jynus@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2183-2184].codfw.wmnet with reason: Switchover db
* 15:03 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:01 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 15:00 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 14:59 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:55 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:55 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:54 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:50 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:50 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:50 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:49 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2042.codfw.wmnet with reason: host reimage
* 14:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2042.codfw.wmnet with reason: host reimage
* 14:32 Lucas_WMDE: UTC afternoon backport+config window done
* 14:32 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298709{{!}}Add translatable messages for WikiProject names (T427804)]], [[gerrit:1298710{{!}}Use translatable messages for WikiProject links (T427804)]], [[gerrit:1297644{{!}}WikiProject links - remove 'text' config (T427804)]] (duration: 31m 57s)
* 14:27 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:26 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2042.codfw.wmnet with OS trixie
* 14:26 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2042: Upgrading es2042.codfw.wmnet
* 14:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2042: Upgrading es2042.codfw.wmnet
* 14:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:24 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2043 to es4 codfw primary [[phab:T428386|T428386]]', diff saved to https://phabricator.wikimedia.org/P93926 and previous config saved to /var/cache/conftool/dbconfig/20260608-142423-marostegui.json
* 14:23 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1041: repool after maintenance
* 14:19 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Continuing with deployment
* 14:18 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Backport for [[gerrit:1298709{{!}}Add translatable messages for WikiProject names (T427804)]], [[gerrit:1298710{{!}}Use translatable messages for WikiProject links (T427804)]], [[gerrit:1297644{{!}}WikiProject links - remove 'text' config (T427804)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:11 cgoubert@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=liftwing-openapi-server.*
* 14:10 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp6013.*
* 14:10 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:05 gkyziridis@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 14:05 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 13:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 13:52 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:50 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296550{{!}}hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608)]] (duration: 08m 31s)
* 13:48 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:46 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 13:43 cgoubert@dns1004: END - running authdns-update
* 13:43 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296550{{!}}hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:41 cgoubert@dns1004: START - running authdns-update
* 13:41 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296550{{!}}hCaptcha: Don't show AbuseFilter CAPTCHA for wbsetclaim API (T427608)]]
* 13:39 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* {{safesubst:SAL entry|1=13:38 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298758{{!}}feat(V2): toggle experiment features based on custom url override (T424646)]], [[gerrit:1298762{{!}}specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646)]], [[gerrit:1298764{{!}}fix: correctly read experiments param on Special:UserLogin]], [[gerrit:1298765{{!}}signup.js: use JS var instead of TestKitchen to show exp}}
* 13:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1041: repool after maintenance
* 13:38 gkyziridis@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'liftwing-openapi-server' for release 'main' .
* 13:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:37 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 13:36 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 13:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1041.eqiad.wmnet with OS trixie
* 13:34 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 13:34 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2041: repool after upgrade
* 13:34 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde: Continuing with deployment
* 13:34 urbanecm@deploy1003: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 13:32 urbanecm@deploy1003: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* {{safesubst:SAL entry|1=13:30 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde: Backport for [[gerrit:1298758{{!}}feat(V2): toggle experiment features based on custom url override (T424646)]], [[gerrit:1298762{{!}}specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646)]], [[gerrit:1298764{{!}}fix: correctly read experiments param on Special:UserLogin]], [[gerrit:1298765{{!}}signup.js: use JS var instead of TestKitchen to show}}
* {{safesubst:SAL entry|1=13:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298758{{!}}feat(V2): toggle experiment features based on custom url override (T424646)]], [[gerrit:1298762{{!}}specialCreateAccount: use GECreateAccountExperimentV2 instead of hook (T424646)]], [[gerrit:1298764{{!}}fix: correctly read experiments param on Special:UserLogin]], [[gerrit:1298765{{!}}signup.js: use JS var instead of TestKitchen to show expe}}
* 13:21 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298418{{!}}NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206)]], [[gerrit:1298717{{!}}Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206)]], [[gerrit:1298734{{!}}Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206)]] (duration: 11m 06s)
* 13:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1041.eqiad.wmnet with reason: host reimage
* 13:17 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
* 13:12 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 13:12 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1298418{{!}}NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206)]], [[gerrit:1298717{{!}}Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206)]], [[gerrit:1298734{{!}}Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki
* 13:12 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 13:12 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1041.eqiad.wmnet with reason: host reimage
* 13:11 kamila@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 13:11 kamila@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 13:10 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1298418{{!}}NewUserMessage: Add $wgNewUserMessageOnAutoCreateFirstEdit (T426206)]], [[gerrit:1298717{{!}}Replace NewUserMessageOnAutoCreateFirstEdit with wgNewUserMessageOnFirstEdit (T426206)]], [[gerrit:1298734{{!}}Enable wgNewUserMessageOnFirstEdit on incubatorwiki (T426206)]]
* 12:57 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298767{{!}}Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608)]] (duration: 06m 20s)
* 12:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1041.eqiad.wmnet with OS trixie
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1041: Upgrading es1041.eqiad.wmnet
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:55 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1041: Upgrading es1041.eqiad.wmnet
* 12:55 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:54 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:53 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:53 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1298767{{!}}Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:51 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1298767{{!}}Follow-up: Allow CaptchaConsequence to be skipped via hook (T427608)]]
* 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2041: repool after upgrade
* 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:46 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2063.codfw.wmnet with OS bullseye
* 12:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2062.codfw.wmnet with OS bullseye
* 12:27 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2041.codfw.wmnet with OS trixie
* 12:21 joal@deploy1003: Finished deploy [analytics/refinery@d67c584] (thin): Regular analytics weekly train THIN [analytics/refinery@d67c584f] (duration: 02m 00s)
* 12:19 joal@deploy1003: Started deploy [analytics/refinery@d67c584] (thin): Regular analytics weekly train THIN [analytics/refinery@d67c584f]
* 12:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
* 12:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 12:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 12:16 joal@deploy1003: Finished deploy [analytics/refinery@d67c584]: Regular analytics weekly train [analytics/refinery@d67c584f] (duration: 07m 52s)
* 12:15 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2063.codfw.wmnet with reason: host reimage
* 12:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
* 12:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2041.codfw.wmnet with reason: host reimage
* 12:08 joal@deploy1003: Started deploy [analytics/refinery@d67c584]: Regular analytics weekly train [analytics/refinery@d67c584f]
* 12:08 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2062.codfw.wmnet with reason: host reimage
* 12:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add eqiad e8 public vlans - ayounsi@cumin1003"
* 12:06 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add eqiad e8 public vlans - ayounsi@cumin1003"
* 12:03 joal@deploy1003: Finished deploy [analytics/refinery@d67c584] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d67c584f] (duration: 02m 00s)
* 12:03 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2041.codfw.wmnet with reason: host reimage
* 12:01 joal@deploy1003: Started deploy [analytics/refinery@d67c584] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d67c584f]
* 12:01 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:00 ayounsi@cumin1003: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 12:00 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:00 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 12:00 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2063
* 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2063
* 11:57 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2063
* 11:57 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2063.codfw.wmnet 52.16.192.10.in-addr.arpa 2.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:56 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2063.codfw.wmnet 52.16.192.10.in-addr.arpa 2.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:56 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:56 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2063 - mvernon@cumin2002"
* 11:56 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2063 - mvernon@cumin2002"
* 11:51 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:51 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2063
* 11:50 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2063.codfw.wmnet with OS bullseye
* 11:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2062
* 11:50 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2062
* 11:49 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2062
* 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2062.codfw.wmnet 123.0.192.10.in-addr.arpa 3.2.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:49 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2062.codfw.wmnet 123.0.192.10.in-addr.arpa 3.2.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:49 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2062 - mvernon@cumin2002"
* 11:49 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2062 - mvernon@cumin2002"
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2041.codfw.wmnet with OS trixie
* 11:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2041: Upgrading es2041.codfw.wmnet
* 11:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2041: Upgrading es2041.codfw.wmnet
* 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:44 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.major-upgrade (exit_code=97)
* 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: repool after maintenance
* 11:43 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:43 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2062
* 11:42 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2062.codfw.wmnet with OS bullseye
* 11:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298728{{!}}SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032)]] (duration: 17m 39s)
* 11:25 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:18 Raine: progressively switching shellbox to bookworm (start)
* 11:15 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 11:14 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 11:14 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1298728{{!}}SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:13 kamila@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 11:12 kamila@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 11:12 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1298728{{!}}SpecialMediaSearch: Prefer thumb steps over thumb limits (T424032)]]
* 11:02 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2062
* 11:02 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2063
* 10:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1042: repool after maintenance
* 10:58 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:56 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1042.eqiad.wmnet with OS trixie
* 10:47 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1298721{{!}}GuessedThumbnailInfo: Also allow showing webp originals (T428202)]] (duration: 16m 41s)
* 10:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1042.eqiad.wmnet with reason: host reimage
* 10:39 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 10:39 kamila@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 10:38 kamila@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 10:36 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2160.codfw.wmnet
* 10:36 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2160.codfw.wmnet
* 10:35 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2043: repool after upgrade
* 10:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2160.codfw.wmnet with reason: Reboot
* 10:34 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1298721{{!}}GuessedThumbnailInfo: Also allow showing webp originals (T428202)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:34 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1042.eqiad.wmnet with reason: host reimage
* 10:30 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1298721{{!}}GuessedThumbnailInfo: Also allow showing webp originals (T428202)]]
* 10:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1042.eqiad.wmnet with OS trixie
* 10:18 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:18 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:18 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:18 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:16 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1042: Upgrading es1042.eqiad.wmnet
* 10:14 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:14 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:14 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:14 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1042: Upgrading es1042.eqiad.wmnet
* 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:12 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2063
* 10:09 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2062
* 10:07 ihurbain@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:07 ihurbain@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:07 ihurbain@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:06 ihurbain@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:52 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 09:52 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 09:50 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 09:49 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 09:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2043: repool after upgrade
* 09:49 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2043.codfw.wmnet with OS trixie
* 09:44 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 09:44 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 09:42 ozge@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: sync
* 09:42 ozge@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: sync
* 09:41 ozge@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: sync
* 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2043.codfw.wmnet with reason: host reimage
* 09:27 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 09:23 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2043.codfw.wmnet with reason: host reimage
* 09:17 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 09:15 ozge@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: sync
* 09:15 ozge@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: sync
* 09:07 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2043.codfw.wmnet with OS trixie
* 09:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2043: Upgrading es2043.codfw.wmnet
* 09:06 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2043: Upgrading es2043.codfw.wmnet
* 09:05 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1217.eqiad.wmnet with OS trixie
* 08:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1217.eqiad.wmnet with reason: host reimage
* 08:15 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database urwikisource ([[phab:T415977|T415977]])
* 08:14 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database urwikisource ([[phab:T415977|T415977]])
* 08:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1217.eqiad.wmnet with reason: host reimage
* 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2052: repool after upgrade
* 08:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1051: repool after maintenance
* 08:03 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis urwikisource in section s5
* 07:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1217.eqiad.wmnet with OS trixie
* 07:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1217.eqiad.wmnet with reason: reimage
* 07:53 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
* 07:52 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis urwikisource in section s5
* 07:50 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis urwikisource in section s5
* 07:50 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.sanitize-wiki (exit_code=97) Managing sanitization for wikis urwikisource in section s5
* 07:50 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
* 07:44 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297681{{!}}Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662)]] (duration: 32m 51s)
* 07:32 wmde-fisch@deploy1003: wmde-fisch, lilients: Continuing with deployment
* 07:29 wmde-fisch@deploy1003: wmde-fisch, lilients: Backport for [[gerrit:1297681{{!}}Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:21 elukey: upgrade sudo package on an-* hosts for [[phab:T428384|T428384]]
* 07:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2052: repool after upgrade
* 07:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1051: repool after maintenance
* 07:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:12 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database urwikisource ([[phab:T415977|T415977]])
* 07:12 elukey: upgrade exim4 packages on seaborgium for security upgrades
* 07:11 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1297681{{!}}Global rollout - Sub-ref deployments to Group 0, Group 1 and frwiki (T425662)]]
* 06:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1051.eqiad.wmnet with OS trixie
* 06:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1051.eqiad.wmnet with reason: host reimage
* 06:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1051.eqiad.wmnet with reason: host reimage
* 06:15 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database urwikisource ([[phab:T415977|T415977]])
* 05:58 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1051.eqiad.wmnet with OS trixie
* 05:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2052.codfw.wmnet with OS trixie
* 05:44 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool es1051: Upgrading es1051.eqiad.wmnet
* 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2052.codfw.wmnet with reason: host reimage
* 05:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2052.codfw.wmnet with reason: host reimage
* 05:35 marostegui@dns1004: END - running authdns-update
* 05:34 marostegui@dns1004: START - running authdns-update
* 05:33 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1051: Upgrading es1051.eqiad.wmnet
* 05:33 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:31 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1054 to es3 eqiad primary [[phab:T428050|T428050]]', diff saved to https://phabricator.wikimedia.org/P93895 and previous config saved to /var/cache/conftool/dbconfig/20260608-053156-marostegui.json
* 05:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2052.codfw.wmnet with OS trixie
* 05:18 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2052: Upgrading es2052.codfw.wmnet
* 05:18 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2052: Upgrading es2052.codfw.wmnet
* 05:18 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
== 2026-06-07 ==
* 16:32 elukey: <code>elukey@cumin1003:~$ sudo cumin 'cp6* and not cp6014* and not cp6010*' "varnish-frontend-restart" -b 1</code>
* 16:29 elukey: restart varnish-frontend on cp6014
== 2026-06-06 ==
* 09:07 ammarpad@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=hewiki --logwiki=metawiki W.Mechelke Tungsten_Mechelke # [[phab:T428182|T428182]]
== 2026-06-05 ==
* 22:16 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 22:15 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 21:01 Dreamy_Jazz: Running <code>mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=10 --verbose</code> (after stopping the other commons scan)
* 20:56 Dreamy_Jazz: Running <code>mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=30 --verbose</code> (after stopping the other commons scan)
* 20:20 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290093{{!}}Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)]] (duration: 10m 02s)
* 20:16 krinkle@deploy1003: krinkle: Continuing with deployment
* 20:12 krinkle@deploy1003: krinkle: Backport for [[gerrit:1290093{{!}}Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1290093{{!}}Enable wmgUseUrlShortenerLegacy on test2wiki (T107188)]]
* 16:45 jgreen@dns1004: END - running authdns-update
* 16:44 jgreen@dns1004: START - running authdns-update
* 16:17 dzahn@dns1005: END - running authdns-update
* 16:17 mutante: DNS - adding new project language "mag" - Magahi - a language spoken in India and Nepal by about 12 million native speakers ([[phab:T428266|T428266]])
* 16:16 dzahn@dns1005: START - running authdns-update
* 14:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 12:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:30 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:30 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2202.codfw.wmnet with reason: Reboot
* 12:28 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:28 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:08 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:07 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:07 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:06 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:29 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:28 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:55 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:54 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:31 ozge@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1054: repool after upgrade
* 08:08 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 08:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 08:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 08:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:39 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1054: repool after upgrade
* 07:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:17 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:17 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:16 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 07:07 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 06:01 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1054.eqiad.wmnet with OS trixie
* 05:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1054.eqiad.wmnet with reason: host reimage
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1054.eqiad.wmnet with reason: host reimage
* 05:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1054.eqiad.wmnet with OS trixie
* 05:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1054: Upgrading es1054.eqiad.wmnet
* 05:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1054: Upgrading es1054.eqiad.wmnet
* 05:20 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 01:55 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1010.eqiad.wmnet with OS trixie
* 01:39 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
* 01:32 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1010.eqiad.wmnet with reason: host reimage
* 01:16 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1010.eqiad.wmnet with OS trixie
* 00:56 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1007.eqiad.wmnet with OS trixie
* 00:40 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
* 00:33 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1007.eqiad.wmnet with reason: host reimage
* 00:17 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1007.eqiad.wmnet with OS trixie
* 00:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297268{{!}}Redirect unknown wikinews languages to portal (T427126)]] (duration: 07m 02s)
== 2026-06-04 ==
* 23:57 ladsgroup@deploy1003: ladsgroup, pppery: Continuing with deployment
* 23:57 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1006.eqiad.wmnet with OS trixie
* 23:57 ladsgroup@deploy1003: ladsgroup, pppery: Backport for [[gerrit:1297268{{!}}Redirect unknown wikinews languages to portal (T427126)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:55 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1297268{{!}}Redirect unknown wikinews languages to portal (T427126)]]
* 23:40 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
* 23:36 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage
* 23:20 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS trixie
* 21:28 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host releases1003.eqiad.wmnet with OS trixie
* 21:04 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: host reimage
* 20:58 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on releases1003.eqiad.wmnet with reason: host reimage
* 20:50 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.*
* 20:42 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host releases1003.eqiad.wmnet with OS trixie
* 20:27 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1100.eqiad.wmnet,service=(cdn{{!}}ats-be)
* 20:26 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6013.drmrs.wmnet,service=(cdn{{!}}ats-be)
* 20:20 brett@dns1006: END - running authdns-update
* 20:19 brett@dns1006: START - running authdns-update
* 20:18 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5030.eqsin.wmnet with OS trixie
* 20:10 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296015{{!}}Deploy PRV to 6 wikis (T427851)]] (duration: 07m 39s)
* 20:08 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist group2.dblist extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose`
* 20:06 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:04 arlolra@deploy1003: arlolra: Backport for [[gerrit:1296015{{!}}Deploy PRV to 6 wikis (T427851)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:02 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1296015{{!}}Deploy PRV to 6 wikis (T427851)]]
* 19:49 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:43 cmooney@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:15 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cp5030
* 19:15 cmooney@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5030
* 19:14 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cp5030
* 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cp5030.eqsin.wmnet 27.0.132.10.in-addr.arpa 7.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:14 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache cp5030.eqsin.wmnet 27.0.132.10.in-addr.arpa 7.2.0.0.0.0.0.0.2.3.1.0.0.1.0.0.1.0.1.0.0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa on all recursors
* 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:14 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5030 - cmooney@cumin1003"
* 19:13 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cp5030 - cmooney@cumin1003"
* 19:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:08 cmooney@cumin1003: START - Cookbook sre.hosts.move-vlan for host cp5030
* 19:08 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host cp5030.eqsin.wmnet with OS trixie
* 18:51 cmooney@dns2005: END - running authdns-update
* 18:50 cmooney@dns2005: START - running authdns-update
* 18:43 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:42 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for eqsin cr links - cmooney@cumin1003"
* 18:40 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for eqsin cr links - cmooney@cumin1003"
* 18:37 sukhe: sukhe@cp6013:~$ sudo traffic_server -C clear_cache
* 18:36 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:08 dancy@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.5 refs [[phab:T423914|T423914]]
* 17:17 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297751{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297752{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] (duration: 06m 40s)
* 17:13 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:13 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297751{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297752{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:11 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297751{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297752{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]]
* 16:55 topranks: shift traffic off cr1-esams et-1/0/1 link to asw1-by27-esams [[phab:T427056|T427056]]
* 16:45 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297741{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297742{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] (duration: 13m 58s)
* 16:41 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 16:33 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297741{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297742{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:31 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297741{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]], [[gerrit:1297742{{!}}hCaptcha: Update MF interface name for instrumentation (T428178)]]
* 16:17 ozge@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:03 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297740{{!}}hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)]] (duration: 10m 21s)
* 16:03 elukey: uploaded spicerack_12.7.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 15:59 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:55 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297740{{!}}hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:53 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297740{{!}}hCaptcha: Move ConfirmEditCaptchaClass hook inside hCaptcha block (T428183)]]
* 15:44 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5030.*
* 15:41 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2007.codfw.wmnet with OS trixie
* 15:39 ladsgroup@cumin1003: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
* 15:28 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 15:24 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297730{{!}}ptwiki: Disable Article Guidance experiment (T426871)]] (duration: 07m 26s)
* 15:24 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
* 15:20 sbisson@deploy1003: sbisson: Continuing with deployment
* 15:19 sbisson@deploy1003: sbisson: Backport for [[gerrit:1297730{{!}}ptwiki: Disable Article Guidance experiment (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:19 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2007.codfw.wmnet with reason: host reimage
* 15:17 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1297730{{!}}ptwiki: Disable Article Guidance experiment (T426871)]]
* 15:13 ladsgroup@cumin1003: END (PASS) - Cookbook sre.wikireplicas.update-views (exit_code=0)
* 15:06 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297724{{!}}Revert "Start reading from new file tables on commons"]] (duration: 07m 00s)
* 15:05 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 15:02 zabe@deploy1003: zabe: Continuing with deployment
* 15:01 zabe@deploy1003: zabe: Backport for [[gerrit:1297724{{!}}Revert "Start reading from new file tables on commons"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1297724{{!}}Revert "Start reading from new file tables on commons"]]
* 14:57 zabe@deploy1003: Finished scap sync-world: [[phab:T416548|T416548]] (duration: 05m 10s)
* 14:56 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-main2007.codfw.wmnet with OS trixie
* 14:52 zabe@deploy1003: Started scap sync-world: [[phab:T416548|T416548]]
* 14:50 btullis@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:49 btullis@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:43 zabe@deploy1003: sync-world aborted: Backport for [[gerrit:1270513{{!}}Start reading from new file tables on commons (T416548)]] (duration: 03m 58s)
* 14:43 zabe@deploy1003: zabe: Continuing with deployment
* 14:41 zabe@deploy1003: zabe: Backport for [[gerrit:1270513{{!}}Start reading from new file tables on commons (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f1-codfw
* 14:40 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-f1-codfw
* 14:39 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1270513{{!}}Start reading from new file tables on commons (T416548)]]
* 14:36 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297711{{!}}hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)]] (duration: 08m 20s)
* 14:32 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:30 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297711{{!}}hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1057: repool after upgrade
* 14:28 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297711{{!}}hCaptcha: Enable for MobileFrontend in some Group 2 wikis (T425940)]]
* 14:20 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:16 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 14:16 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 14:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 14:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297704{{!}}Use the globalblock-local-status right over globalblock-whitelist (T277942)]], [[gerrit:1296620{{!}}core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)]] (duration: 06m 46s)
* 14:10 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:08 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:08 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297704{{!}}Use the globalblock-local-status right over globalblock-whitelist (T277942)]], [[gerrit:1296620{{!}}core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 14:06 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297704{{!}}Use the globalblock-local-status right over globalblock-whitelist (T277942)]], [[gerrit:1296620{{!}}core-Permissions: Stop assigning unused globalblock-whitelist right (T277942)]]
* 14:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:06 tappof: bump space for prometheus k8s-aux in eqiad
* 14:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 14:05 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
* 14:04 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
* 13:56 _joe_: transferred requestctl api tokens for all ops to the db ([[phab:T428119|T428119]])
* 13:56 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2050 to es3 codfw primary [[phab:T428050|T428050]]', diff saved to https://phabricator.wikimedia.org/P93878 and previous config saved to /var/cache/conftool/dbconfig/20260604-135631-marostegui.json
* 13:56 Dreamy_Jazz: Afternoon UTC backport window done
* 13:54 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297700{{!}}Revert "hCaptcha: Provide always challenge sitekey for account creation"]] (duration: 13m 38s)
* 13:51 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:50 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 13:47 sukhe: sukhe@cp6011:~$ sudo -i varnish-frontend-restart
* 13:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1057: repool after upgrade
* 13:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:43 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297700{{!}}Revert "hCaptcha: Provide always challenge sitekey for account creation"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1057.eqiad.wmnet with OS trixie
* 13:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297700{{!}}Revert "hCaptcha: Provide always challenge sitekey for account creation"]]
* 13:38 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297692{{!}}hCaptcha: Provide always challenge sitekey for account creation (T421041)]] (duration: 05m 27s)
* 13:38 dreamyjazz@deploy1003: dreamyjazz: Rolling back deployment
* 13:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: down
* 13:35 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297692{{!}}hCaptcha: Provide always challenge sitekey for account creation (T421041)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297692{{!}}hCaptcha: Provide always challenge sitekey for account creation (T421041)]]
* 13:31 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295978{{!}}Update config for WikiProjects linking prototype (T427804)]] (duration: 17m 13s)
* 13:26 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Continuing with deployment
* 13:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1057.eqiad.wmnet with reason: host reimage
* 13:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1057.eqiad.wmnet with reason: host reimage
* 13:16 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, audreypenven: Backport for [[gerrit:1295978{{!}}Update config for WikiProjects linking prototype (T427804)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1295978{{!}}Update config for WikiProjects linking prototype (T427804)]]
* 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1220: Migration of db1220.eqiad.wmnet completed
* 13:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: down
* 13:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1224', diff saved to https://phabricator.wikimedia.org/P93875 and previous config saved to /var/cache/conftool/dbconfig/20260604-131219-marostegui.json
* 13:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1057.eqiad.wmnet with OS trixie
* 13:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1057: Upgrading es1057.eqiad.wmnet
* 12:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1057: Upgrading es1057.eqiad.wmnet
* 12:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:56 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296557{{!}}wmf-config: Skip CAPTCHA for action=mcrundo (T427612)]] (duration: 08m 30s)
* 12:52 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Continuing with deployment
* 12:50 dreamyjazz@deploy1003: mpostoronca, dreamyjazz: Backport for [[gerrit:1296557{{!}}wmf-config: Skip CAPTCHA for action=mcrundo (T427612)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2050: repool after upgrade
* 12:48 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296557{{!}}wmf-config: Skip CAPTCHA for action=mcrundo (T427612)]]
* 12:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 12:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 12:28 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1220: Migration of db1220.eqiad.wmnet completed
* 12:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1220.eqiad.wmnet with OS trixie
* 12:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2050: repool after upgrade
* 12:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1220.eqiad.wmnet with reason: host reimage
* 11:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1220.eqiad.wmnet with reason: host reimage
* 11:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1220.eqiad.wmnet with OS trixie
* 11:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2050.codfw.wmnet with OS trixie
* 11:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1220: Upgrading db1220.eqiad.wmnet
* 11:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1220: Upgrading db1220.eqiad.wmnet
* 11:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1179: Migration of db1179.eqiad.wmnet completed
* 11:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2050.codfw.wmnet with reason: host reimage
* 11:16 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2050.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2050.codfw.wmnet with OS trixie
* 11:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2050: Upgrading es2050.codfw.wmnet
* 10:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2050: Upgrading es2050.codfw.wmnet
* 10:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2057: repool after upgrade
* 10:58 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:55 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 10:46 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1179: Migration of db1179.eqiad.wmnet completed
* 10:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1179.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1179.eqiad.wmnet with reason: host reimage
* 10:16 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
* 10:15 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
* 10:15 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
* 10:15 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/kartotherian: apply
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1179.eqiad.wmnet with reason: host reimage
* 10:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2057: repool after upgrade
* 10:13 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2057.codfw.wmnet with OS trixie
* 09:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1179.eqiad.wmnet with OS trixie
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1179: Upgrading db1179.eqiad.wmnet
* 09:58 jynus: redoing m2 backups after grant change [[phab:T411111|T411111]]
* 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1179: Upgrading db1179.eqiad.wmnet
* 09:56 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2057.codfw.wmnet with reason: host reimage
* 09:53 ozge@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2057.codfw.wmnet with reason: host reimage
* 09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Migration of db1224.eqiad.wmnet completed
* 09:38 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:37 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:36 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:35 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/kafka-ui: apply
* 09:33 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2057.codfw.wmnet with OS trixie
* 09:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2057: Upgrading es2057.codfw.wmnet
* 09:32 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2057: Upgrading es2057.codfw.wmnet
* 09:31 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:26 Dreamy_Jazz: Running `mwscript-k8s extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki="commonswiki" --use-jobqueue --poll-sleep=30 --sleep=60 --verbose`
* 09:25 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "group0.dblist + group1.dblist - mediamoderation-continuous-scan.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose`
* 08:54 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Introduce pluggable authentication - oblivian@cumin1003"
* 08:54 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Introduce pluggable authentication - oblivian@cumin1003
* 08:53 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Migration of db1224.eqiad.wmnet completed
* 08:53 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Introduce pluggable authentication - oblivian@cumin1003
* 08:53 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Introduce pluggable authentication - oblivian@cumin1003"
* 08:29 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:29 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 08:24 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:24 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 08:21 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1224.eqiad.wmnet with OS trixie
* 08:21 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 08:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
* 08:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2249.codfw.wmnet with reason: upgrade
* 08:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
* 07:53 marostegui: Install mariadb 10.11.17 on db2249 [[phab:T427345|T427345]]
* 07:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1224.eqiad.wmnet with OS trixie
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1224: Upgrading db1224.eqiad.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1224: Upgrading db1224.eqiad.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1255: Migration of db1255.eqiad.wmnet completed
* 07:34 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297536{{!}}hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200{{!}}hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173{{!}}hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]] (duration: 08m 56s)
* 07:29 kharlan@deploy1003: kharlan, harroyo-wmf: Continuing with deployment
* 07:27 kharlan@deploy1003: kharlan, harroyo-wmf: Backport for [[gerrit:1297536{{!}}hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200{{!}}hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173{{!}}hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwd
* 07:25 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297536{{!}}hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200{{!}}hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173{{!}}hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]]
* 07:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2191: Migration of db2191.codfw.wmnet completed
* 07:12 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297550{{!}}Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]] (duration: 06m 45s)
* 07:08 kharlan@deploy1003: kharlan: Continuing with deployment
* 07:08 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297550{{!}}Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:06 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297550{{!}}Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]]
* 07:04 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297260{{!}}EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)]] (duration: 399m 30s)
* 07:03 otto@deploy1003: otto: Rolling back deployment
* 06:53 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1255: Migration of db1255.eqiad.wmnet completed
* 06:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1255.eqiad.wmnet with OS trixie
* 06:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2191: Migration of db2191.codfw.wmnet completed
* 06:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1255.eqiad.wmnet with reason: host reimage
* 06:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2191.codfw.wmnet with OS trixie
* 06:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1255.eqiad.wmnet with reason: host reimage
* 06:16 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1255.eqiad.wmnet with OS trixie
* 06:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2191.codfw.wmnet with reason: host reimage
* 06:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1255: Upgrading db1255.eqiad.wmnet
* 06:12 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1255: Upgrading db1255.eqiad.wmnet
* 06:12 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2191.codfw.wmnet with reason: host reimage
* 06:04 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db1255 [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93836 and previous config saved to /var/cache/conftool/dbconfig/20260604-060428-cwilliams.json
* 06:03 cwilliams@dns1004: END - running authdns-update
* 06:02 cwilliams@dns1004: START - running authdns-update
* 05:54 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db1258 to x3 primary and set section read-write [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93835 and previous config saved to /var/cache/conftool/dbconfig/20260604-055429-cwilliams.json
* 05:53 cwilliams@cumin1003: dbctl commit (dc=all): 'Set x3 eqiad as read-only for maintenance - [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93834 and previous config saved to /var/cache/conftool/dbconfig/20260604-055346-cwilliams.json
* 05:53 cezmunsta: Starting x3 eqiad failover from db1255 to db1258 - [[phab:T427895|T427895]]
* 05:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2191.codfw.wmnet with OS trixie
* 05:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2191: Upgrading db2191.codfw.wmnet
* 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2191: Upgrading db2191.codfw.wmnet
* 05:50 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db1258 with weight 0 [[phab:T427895|T427895]]', diff saved to https://phabricator.wikimedia.org/P93833 and previous config saved to /var/cache/conftool/dbconfig/20260604-055021-cwilliams.json
* 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:50 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T427895|T427895]]
* 05:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 05:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2191 [[phab:T428120|T428120]]', diff saved to https://phabricator.wikimedia.org/P93832 and previous config saved to /var/cache/conftool/dbconfig/20260604-054614-marostegui.json
* 05:45 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2215 to x1 primary [[phab:T428120|T428120]]', diff saved to https://phabricator.wikimedia.org/P93831 and previous config saved to /var/cache/conftool/dbconfig/20260604-054528-marostegui.json
* 05:44 marostegui: Starting x1 codfw failover from db2191 to db2215 - [[phab:T428120|T428120]]
* 05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 16 hosts with reason: Primary switchover x1 [[phab:T428120|T428120]]
* 05:27 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2215 with weight 0 [[phab:T428120|T428120]]', diff saved to https://phabricator.wikimedia.org/P93830 and previous config saved to /var/cache/conftool/dbconfig/20260604-052722-marostegui.json
* 05:19 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 03:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93829 and previous config saved to /var/cache/conftool/dbconfig/20260604-034546-fceratto.json
* 03:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P93828 and previous config saved to /var/cache/conftool/dbconfig/20260604-033538-fceratto.json
* 03:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P93827 and previous config saved to /var/cache/conftool/dbconfig/20260604-032531-fceratto.json
* 03:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93826 and previous config saved to /var/cache/conftool/dbconfig/20260604-031523-fceratto.json
* 03:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1263 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93825 and previous config saved to /var/cache/conftool/dbconfig/20260604-030710-fceratto.json
* 03:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1263.eqiad.wmnet with reason: Maintenance
* 03:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93824 and previous config saved to /var/cache/conftool/dbconfig/20260604-030642-fceratto.json
* 02:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P93823 and previous config saved to /var/cache/conftool/dbconfig/20260604-025634-fceratto.json
* 02:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P93822 and previous config saved to /var/cache/conftool/dbconfig/20260604-024627-fceratto.json
* 02:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93821 and previous config saved to /var/cache/conftool/dbconfig/20260604-023619-fceratto.json
* 02:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1262 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93820 and previous config saved to /var/cache/conftool/dbconfig/20260604-022809-fceratto.json
* 02:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1262.eqiad.wmnet with reason: Maintenance
* 02:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93819 and previous config saved to /var/cache/conftool/dbconfig/20260604-022742-fceratto.json
* 02:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P93818 and previous config saved to /var/cache/conftool/dbconfig/20260604-021734-fceratto.json
* 02:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P93817 and previous config saved to /var/cache/conftool/dbconfig/20260604-020726-fceratto.json
* 01:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93816 and previous config saved to /var/cache/conftool/dbconfig/20260604-015718-fceratto.json
* 01:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1261 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93815 and previous config saved to /var/cache/conftool/dbconfig/20260604-014909-fceratto.json
* 01:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1261.eqiad.wmnet with reason: Maintenance
* 01:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93814 and previous config saved to /var/cache/conftool/dbconfig/20260604-014841-fceratto.json
* 01:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P93813 and previous config saved to /var/cache/conftool/dbconfig/20260604-013833-fceratto.json
* 01:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P93812 and previous config saved to /var/cache/conftool/dbconfig/20260604-012826-fceratto.json
* 01:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93811 and previous config saved to /var/cache/conftool/dbconfig/20260604-011818-fceratto.json
* 01:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1260 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93810 and previous config saved to /var/cache/conftool/dbconfig/20260604-011005-fceratto.json
* 01:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1260.eqiad.wmnet with reason: Maintenance
* 01:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93809 and previous config saved to /var/cache/conftool/dbconfig/20260604-010937-fceratto.json
* 00:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P93808 and previous config saved to /var/cache/conftool/dbconfig/20260604-005929-fceratto.json
* 00:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P93807 and previous config saved to /var/cache/conftool/dbconfig/20260604-004922-fceratto.json
* 00:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93806 and previous config saved to /var/cache/conftool/dbconfig/20260604-003914-fceratto.json
* 00:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1252 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93805 and previous config saved to /var/cache/conftool/dbconfig/20260604-002851-fceratto.json
* 00:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1252.eqiad.wmnet with reason: Maintenance
* 00:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93804 and previous config saved to /var/cache/conftool/dbconfig/20260604-002821-fceratto.json
* 00:26 otto@deploy1003: otto: Backport for [[gerrit:1297260{{!}}EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:24 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1297260{{!}}EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)]]
* 00:18 Amir1: mwscript-k8s --follow --dblist=all -- extensions/timeline/maintenance/DeleteOldTimelineFiles.php --date 20210101000000
* 00:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P93803 and previous config saved to /var/cache/conftool/dbconfig/20260604-001813-fceratto.json
* 00:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P93802 and previous config saved to /var/cache/conftool/dbconfig/20260604-000805-fceratto.json
== 2026-06-03 ==
* 23:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93801 and previous config saved to /var/cache/conftool/dbconfig/20260603-235758-fceratto.json
* 23:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93800 and previous config saved to /var/cache/conftool/dbconfig/20260603-234935-fceratto.json
* 23:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
* 23:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93799 and previous config saved to /var/cache/conftool/dbconfig/20260603-234907-fceratto.json
* 23:42 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296561{{!}}Add a maintenance script to delete old files]], [[gerrit:1296560{{!}}Add a maintenance script to delete old files]] (duration: 07m 09s)
* 23:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P93798 and previous config saved to /var/cache/conftool/dbconfig/20260603-233859-fceratto.json
* 23:37 ladsgroup@deploy1003: ladsgroup, reedy: Continuing with deployment
* 23:36 ladsgroup@deploy1003: ladsgroup, reedy: Backport for [[gerrit:1296561{{!}}Add a maintenance script to delete old files]], [[gerrit:1296560{{!}}Add a maintenance script to delete old files]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:34 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1296561{{!}}Add a maintenance script to delete old files]], [[gerrit:1296560{{!}}Add a maintenance script to delete old files]]
* 23:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P93797 and previous config saved to /var/cache/conftool/dbconfig/20260603-232852-fceratto.json
* 23:22 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 23:22 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 23:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93796 and previous config saved to /var/cache/conftool/dbconfig/20260603-231844-fceratto.json
* 23:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93795 and previous config saved to /var/cache/conftool/dbconfig/20260603-231031-fceratto.json
* 23:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
* 23:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93794 and previous config saved to /var/cache/conftool/dbconfig/20260603-231001-fceratto.json
* 22:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P93793 and previous config saved to /var/cache/conftool/dbconfig/20260603-225953-fceratto.json
* 22:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P93792 and previous config saved to /var/cache/conftool/dbconfig/20260603-224945-fceratto.json
* 22:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93791 and previous config saved to /var/cache/conftool/dbconfig/20260603-223937-fceratto.json
* 22:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1244 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93790 and previous config saved to /var/cache/conftool/dbconfig/20260603-223116-fceratto.json
* 22:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1244.eqiad.wmnet with reason: Maintenance
* 22:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93789 and previous config saved to /var/cache/conftool/dbconfig/20260603-223048-fceratto.json
* 22:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P93788 and previous config saved to /var/cache/conftool/dbconfig/20260603-222041-fceratto.json
* 22:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P93787 and previous config saved to /var/cache/conftool/dbconfig/20260603-221034-fceratto.json
* 22:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93786 and previous config saved to /var/cache/conftool/dbconfig/20260603-220026-fceratto.json
* 21:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1243 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93785 and previous config saved to /var/cache/conftool/dbconfig/20260603-215110-fceratto.json
* 21:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93784 and previous config saved to /var/cache/conftool/dbconfig/20260603-215053-fceratto.json
* 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P93783 and previous config saved to /var/cache/conftool/dbconfig/20260603-214046-fceratto.json
* 21:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P93782 and previous config saved to /var/cache/conftool/dbconfig/20260603-213038-fceratto.json
* 21:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93781 and previous config saved to /var/cache/conftool/dbconfig/20260603-212030-fceratto.json
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1242 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93779 and previous config saved to /var/cache/conftool/dbconfig/20260603-211206-fceratto.json
* 21:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
* 21:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93778 and previous config saved to /var/cache/conftool/dbconfig/20260603-211138-fceratto.json
* 21:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P93774 and previous config saved to /var/cache/conftool/dbconfig/20260603-210130-fceratto.json
* 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P93773 and previous config saved to /var/cache/conftool/dbconfig/20260603-205122-fceratto.json
* 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93772 and previous config saved to /var/cache/conftool/dbconfig/20260603-204115-fceratto.json
* 20:33 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297228{{!}}Attribution research don't use testKitchen compatibility layer (T417050)]] (duration: 06m 41s)
* 20:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1241 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93771 and previous config saved to /var/cache/conftool/dbconfig/20260603-203254-fceratto.json
* 20:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
* 20:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93770 and previous config saved to /var/cache/conftool/dbconfig/20260603-203227-fceratto.json
* 20:29 cjming@deploy1003: cjming: Continuing with deployment
* 20:29 cjming@deploy1003: cjming: Backport for [[gerrit:1297228{{!}}Attribution research don't use testKitchen compatibility layer (T417050)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:26 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1297228{{!}}Attribution research don't use testKitchen compatibility layer (T417050)]]
* 20:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P93769 and previous config saved to /var/cache/conftool/dbconfig/20260603-202219-fceratto.json
* 20:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P93766 and previous config saved to /var/cache/conftool/dbconfig/20260603-201211-fceratto.json
* 20:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93765 and previous config saved to /var/cache/conftool/dbconfig/20260603-200203-fceratto.json
* 19:59 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
* 19:59 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
* 19:59 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 19:59 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 19:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93764 and previous config saved to /var/cache/conftool/dbconfig/20260603-195341-fceratto.json
* 19:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
* 19:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93763 and previous config saved to /var/cache/conftool/dbconfig/20260603-195313-fceratto.json
* 19:47 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 19:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P93762 and previous config saved to /var/cache/conftool/dbconfig/20260603-194306-fceratto.json
* 19:39 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 19:37 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 19:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P93761 and previous config saved to /var/cache/conftool/dbconfig/20260603-193258-fceratto.json
* 19:26 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
* 19:25 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
* 19:25 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 19:25 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 19:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93760 and previous config saved to /var/cache/conftool/dbconfig/20260603-192250-fceratto.json
* 19:22 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 19:22 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 19:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93759 and previous config saved to /var/cache/conftool/dbconfig/20260603-191437-fceratto.json
* 19:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1024-1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 19:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
* 19:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93758 and previous config saved to /var/cache/conftool/dbconfig/20260603-191348-fceratto.json
* 19:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P93757 and previous config saved to /var/cache/conftool/dbconfig/20260603-190340-fceratto.json
* 18:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P93756 and previous config saved to /var/cache/conftool/dbconfig/20260603-185331-fceratto.json
* 18:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93755 and previous config saved to /var/cache/conftool/dbconfig/20260603-184324-fceratto.json
* 18:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1199 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93754 and previous config saved to /var/cache/conftool/dbconfig/20260603-183455-fceratto.json
* 18:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
* 18:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93753 and previous config saved to /var/cache/conftool/dbconfig/20260603-183427-fceratto.json
* 18:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P93752 and previous config saved to /var/cache/conftool/dbconfig/20260603-182420-fceratto.json
* 18:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P93751 and previous config saved to /var/cache/conftool/dbconfig/20260603-181412-fceratto.json
* 18:10 dancy@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.5 refs [[phab:T423914|T423914]]
* 18:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93750 and previous config saved to /var/cache/conftool/dbconfig/20260603-180404-fceratto.json
* 17:57 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 17:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93749 and previous config saved to /var/cache/conftool/dbconfig/20260603-175544-fceratto.json
* 17:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
* 17:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93748 and previous config saved to /var/cache/conftool/dbconfig/20260603-175342-fceratto.json
* 17:52 hashar: contint1003: sudo puppet agent --disable "Prevent Jenkins from coming back"
* 17:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P93747 and previous config saved to /var/cache/conftool/dbconfig/20260603-174334-fceratto.json
* 17:38 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:37 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:37 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:33 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P93746 and previous config saved to /var/cache/conftool/dbconfig/20260603-173327-fceratto.json
* 17:33 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:32 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 17:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host sretest2012.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93745 and previous config saved to /var/cache/conftool/dbconfig/20260603-172319-fceratto.json
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: Stopping before sync operations
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: Started scap sync-world: No-deploy scap run to verify scap config change
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1253 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93744 and previous config saved to /var/cache/conftool/dbconfig/20260603-171521-fceratto.json
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1253.eqiad.wmnet with reason: Maintenance
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93743 and previous config saved to /var/cache/conftool/dbconfig/20260603-171452-fceratto.json
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:12 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:10 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:10 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:10 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:09 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2012.wikimedia.org with OS trixie
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P93742 and previous config saved to /var/cache/conftool/dbconfig/20260603-170444-fceratto.json
* 17:04 swfrench@deploy1003: Stopping before sync operations
* 17:03 swfrench@deploy1003: Started scap sync-world: No-deploy scap run to verify clean state before config change
* 16:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P93741 and previous config saved to /var/cache/conftool/dbconfig/20260603-165436-fceratto.json
* 16:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:53 hashar: Restarting CI Jenkins one last time # [[phab:T418521|T418521]]
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:44 btullis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295922{{!}}Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)]] (duration: 07m 16s)
* 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93740 and previous config saved to /var/cache/conftool/dbconfig/20260603-164428-fceratto.json
* 16:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:43 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:42 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:41 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:40 btullis@deploy1003: btullis: Continuing with deployment
* 16:39 btullis@deploy1003: btullis: Backport for [[gerrit:1295922{{!}}Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93739 and previous config saved to /var/cache/conftool/dbconfig/20260603-163726-fceratto.json
* 16:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance
* 16:37 btullis@deploy1003: Started scap sync-world: Backport for [[gerrit:1295922{{!}}Declare the webrequest.dumps.dev0 stream in EventStreamConfig (T291645 T425087)]]
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93738 and previous config saved to /var/cache/conftool/dbconfig/20260603-163658-fceratto.json
* 16:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P93737 and previous config saved to /var/cache/conftool/dbconfig/20260603-162650-fceratto.json
* 16:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:19 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P93736 and previous config saved to /var/cache/conftool/dbconfig/20260603-161643-fceratto.json
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93735 and previous config saved to /var/cache/conftool/dbconfig/20260603-160635-fceratto.json
* 16:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93734 and previous config saved to /var/cache/conftool/dbconfig/20260603-155928-fceratto.json
* 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93733 and previous config saved to /var/cache/conftool/dbconfig/20260603-155859-fceratto.json
* 15:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P93732 and previous config saved to /var/cache/conftool/dbconfig/20260603-154852-fceratto.json
* 15:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:46 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2012.wikimedia.org with OS trixie
* 15:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:40 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/linked-artifacts: apply
* 15:40 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/linked-artifacts: apply
* 15:40 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 15:39 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 15:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P93731 and previous config saved to /var/cache/conftool/dbconfig/20260603-153844-fceratto.json
* 15:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93729 and previous config saved to /var/cache/conftool/dbconfig/20260603-152836-fceratto.json
* 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
* 15:25 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
* 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
* 15:25 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
* 15:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:23 mutante: disabling Jenkins on CI servers for maintenance
* 15:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host sretest2012
* 15:23 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host sretest2012
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1202 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93728 and previous config saved to /var/cache/conftool/dbconfig/20260603-152129-fceratto.json
* 15:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 15:21 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2012 to codfw - jhancock@cumin2002"
* 15:21 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93727 and previous config saved to /var/cache/conftool/dbconfig/20260603-152102-fceratto.json
* 15:20 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding sretest2012 to codfw - jhancock@cumin2002"
* 15:18 brouberol@dns1004: END - running authdns-update
* 15:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:16 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:16 brouberol@dns1004: START - running authdns-update
* 15:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P93726 and previous config saved to /var/cache/conftool/dbconfig/20260603-151055-fceratto.json
* 15:01 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P93725 and previous config saved to /var/cache/conftool/dbconfig/20260603-150047-fceratto.json
* 14:57 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 cmooney@cumin1003: END (FAIL) - Cookbook sre.netbox.update-extras (exit_code=1) rolling restart_daemons on A:netbox
* 14:51 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93723 and previous config saved to /var/cache/conftool/dbconfig/20260603-145039-fceratto.json
* 14:48 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297137{{!}}Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"]] (duration: 06m 46s)
* 14:47 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/linked-artifacts: apply
* 14:46 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:46 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:43 mlitn@deploy1003: mlitn: Continuing with deployment
* 14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93722 and previous config saved to /var/cache/conftool/dbconfig/20260603-144334-fceratto.json
* 14:43 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 14:43 mlitn@deploy1003: mlitn: Backport for [[gerrit:1297137{{!}}Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93721 and previous config saved to /var/cache/conftool/dbconfig/20260603-144306-fceratto.json
* 14:41 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:41 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:41 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1297137{{!}}Revert "MultimediaViewer: enable image carousel as a beta feature on Wikipedias"]]
* 14:39 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:39 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:39 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:39 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:38 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 14:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 14:34 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297130{{!}}editor: make redesigned anon warning the default experience (T424595)]] (duration: 10m 45s)
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P93719 and previous config saved to /var/cache/conftool/dbconfig/20260603-143259-fceratto.json
* 14:30 vriley@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:28 sgimeno@deploy1003: sgimeno: Continuing with deployment
* 14:25 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1297130{{!}}editor: make redesigned anon warning the default experience (T424595)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:24 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:24 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:23 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1297130{{!}}editor: make redesigned anon warning the default experience (T424595)]]
* 14:23 gengh@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P93717 and previous config saved to /var/cache/conftool/dbconfig/20260603-142251-fceratto.json
* 14:22 gengh@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:22 gengh@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:21 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:21 cmooney@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:21 gengh@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:20 gengh@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:20 gengh@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:20 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:20 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:19 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:19 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:16 vriley@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:16 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:16 gengh@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:13 gengh@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:12 gengh@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93716 and previous config saved to /var/cache/conftool/dbconfig/20260603-141242-fceratto.json
* 14:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:11 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:11 gengh@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc2055.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:10 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host mc2055.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:10 gengh@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:09 gengh@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:08 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host mc2055
* 14:07 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host mc2055
* 14:05 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296631{{!}}translate: adding separate read/write endpoints (T425377)]] (duration: 13m 06s)
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1191 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93715 and previous config saved to /var/cache/conftool/dbconfig/20260603-140537-fceratto.json
* 14:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93714 and previous config saved to /var/cache/conftool/dbconfig/20260603-140507-fceratto.json
* 14:01 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:58 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 13:56 dcausse@deploy1003: atsuko, dcausse: Rolling back deployment
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T426633|T426633]])', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-133440-fceratto.json
* 13:29 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2186: Migration of db2186.codfw.wmnet completed
* 13:28 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295910{{!}}hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)]] (duration: 07m 36s)
* 13:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T426633|T426633]])', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-132638-fceratto.json
* 13:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 13:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93710 and previous config saved to /var/cache/conftool/dbconfig/20260603-132605-fceratto.json
* 13:25 sukhe: sudo cumin 'A:lvs or A:liberica' 'disable-puppet "merging CR 1282764"'
* 13:23 kharlan@deploy1003: kharlan: Continuing with deployment
* 13:22 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295910{{!}}hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295910{{!}}hCaptcha: Roll out self-hosted secure-api.js to all wikis (T403829)]]
* 13:18 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296649{{!}}hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)]] (duration: 07m 46s)
* 13:16 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260603-131556-fceratto.json
* 13:15 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:13 kharlan@deploy1003: dbrant, kharlan: Continuing with deployment
* 13:12 kharlan@deploy1003: dbrant, kharlan: Backport for [[gerrit:1296649{{!}}hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296649{{!}}hCaptcha: Roll out to all except enwiki for mobile apps. (T426048)]]
* 13:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add codfw d3 and e5 public vlans - ayounsi@cumin1003"
* 13:09 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add codfw d3 and e5 public vlans - ayounsi@cumin1003"
* 13:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P93708 and previous config saved to /var/cache/conftool/dbconfig/20260603-130548-fceratto.json
* 13:05 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93706 and previous config saved to /var/cache/conftool/dbconfig/20260603-125540-fceratto.json
* 12:51 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297110{{!}}ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)]] (duration: 07m 44s)
* 12:49 jgreen@dns1004: END - running authdns-update
* 12:47 jgreen@dns1004: START - running authdns-update
* 12:46 jiji@deploy1003: jiji: Continuing with deployment
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93705 and previous config saved to /var/cache/conftool/dbconfig/20260603-124624-fceratto.json
* 12:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93704 and previous config saved to /var/cache/conftool/dbconfig/20260603-124556-fceratto.json
* 12:45 jiji@deploy1003: jiji: Backport for [[gerrit:1297110{{!}}ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:43 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2186: Migration of db2186.codfw.wmnet completed
* 12:43 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1297110{{!}}ProductionServices.php: switch filebackend.php to rdb2013:6381 (T418261 T419976)]]
* 12:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1067.eqiad.wmnet with OS bullseye
* 12:38 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1292364{{!}}Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)]] (duration: 11m 15s)
* 12:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2186.codfw.wmnet with OS trixie
* 12:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P93702 and previous config saved to /var/cache/conftool/dbconfig/20260603-123548-fceratto.json
* 12:34 dreamyjazz@deploy1003: somerandomdeveloper, dreamyjazz: Continuing with deployment
* 12:31 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1066.eqiad.wmnet with OS bullseye
* 12:29 dreamyjazz@deploy1003: somerandomdeveloper, dreamyjazz: Backport for [[gerrit:1292364{{!}}Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:27 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1292364{{!}}Update hCaptcha checks to retrieve API parameters from $_REQUEST (T427105)]]
* 12:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P93701 and previous config saved to /var/cache/conftool/dbconfig/20260603-122541-fceratto.json
* 12:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1067.eqiad.wmnet with reason: host reimage
* 12:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2186.codfw.wmnet with reason: host reimage
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93700 and previous config saved to /var/cache/conftool/dbconfig/20260603-121533-fceratto.json
* 12:13 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ms-be1066.eqiad.wmnet with reason: host reimage
* 12:13 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2186.codfw.wmnet with reason: host reimage
* 12:11 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1067.eqiad.wmnet with reason: host reimage
* 12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93699 and previous config saved to /var/cache/conftool/dbconfig/20260603-120732-fceratto.json
* 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 12:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93698 and previous config saved to /var/cache/conftool/dbconfig/20260603-120634-fceratto.json
* 12:03 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1066.eqiad.wmnet with reason: host reimage
* 11:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P93697 and previous config saved to /var/cache/conftool/dbconfig/20260603-115626-fceratto.json
* 11:54 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2186.codfw.wmnet with OS trixie
* 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1067
* 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1067
* 11:52 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1067
* 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1067.eqiad.wmnet 96.48.64.10.in-addr.arpa 6.9.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:52 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1067.eqiad.wmnet 96.48.64.10.in-addr.arpa 6.9.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:52 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1067 - mvernon@cumin2002"
* 11:52 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1067 - mvernon@cumin2002"
* 11:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2186: Upgrading db2186.codfw.wmnet
* 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2186: Upgrading db2186.codfw.wmnet
* 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:47 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:46 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1067
* 11:46 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1067.eqiad.wmnet with OS bullseye
* 11:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P93695 and previous config saved to /var/cache/conftool/dbconfig/20260603-114618-fceratto.json
* 11:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1066
* 11:46 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1066
* 11:45 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1066
* 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1066.eqiad.wmnet 117.32.64.10.in-addr.arpa 7.1.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:45 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1066.eqiad.wmnet 117.32.64.10.in-addr.arpa 7.1.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:45 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1066 - mvernon@cumin2002"
* 11:45 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1066 - mvernon@cumin2002"
* 11:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 11:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 11:41 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:40 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1066
* 11:40 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1066.eqiad.wmnet with OS bullseye
* 11:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1067
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93693 and previous config saved to /var/cache/conftool/dbconfig/20260603-113611-fceratto.json
* 11:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Migration of db2196.codfw.wmnet completed
* 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93691 and previous config saved to /var/cache/conftool/dbconfig/20260603-112909-fceratto.json
* 11:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 6 hosts with reason: Maintenance
* 11:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
* 11:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93690 and previous config saved to /var/cache/conftool/dbconfig/20260603-112838-fceratto.json
* 11:24 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P93689 and previous config saved to /var/cache/conftool/dbconfig/20260603-111831-fceratto.json
* 11:14 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:09 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 11:09 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 11:08 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P93687 and previous config saved to /var/cache/conftool/dbconfig/20260603-110823-fceratto.json
* 11:07 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1066
* 11:07 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 11:06 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
* 11:05 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
* 11:03 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 11:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289895{{!}}Update UserInfoCard to be enabled by default for certain user groups (T426021)]] (duration: 07m 37s)
* 11:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:59 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 10:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:59 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 10:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 10:58 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93685 and previous config saved to /var/cache/conftool/dbconfig/20260603-105815-fceratto.json
* 10:58 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:57 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:56 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 10:55 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1289895{{!}}Update UserInfoCard to be enabled by default for certain user groups (T426021)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:54 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 10:54 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
* 10:53 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
* 10:53 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1289895{{!}}Update UserInfoCard to be enabled by default for certain user groups (T426021)]]
* 10:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1198 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93684 and previous config saved to /var/cache/conftool/dbconfig/20260603-105006-fceratto.json
* 10:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 10:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93683 and previous config saved to /var/cache/conftool/dbconfig/20260603-104939-fceratto.json
* 10:45 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:45 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Migration of db2196.codfw.wmnet completed
* 10:44 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:41 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:40 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P93681 and previous config saved to /var/cache/conftool/dbconfig/20260603-103931-fceratto.json
* 10:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1053: repool after upgrade
* 10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2196.codfw.wmnet with OS trixie
* 10:36 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297090{{!}}hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)]] (duration: 12m 03s)
* 10:32 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 10:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 10:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 10:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P93679 and previous config saved to /var/cache/conftool/dbconfig/20260603-102924-fceratto.json
* 10:26 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1297090{{!}}hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:24 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1297090{{!}}hCaptcha: Enable for MobileFrontend on most group1 wikis (T425940)]]
* 10:22 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1067
* 10:21 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1066
* 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93677 and previous config saved to /var/cache/conftool/dbconfig/20260603-101916-fceratto.json
* 10:15 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2013.codfw.wmnet
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
* 10:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93676 and previous config saved to /var/cache/conftool/dbconfig/20260603-101105-fceratto.json
* 10:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 10:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93675 and previous config saved to /var/cache/conftool/dbconfig/20260603-101037-fceratto.json
* 10:10 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2013.codfw.wmnet
* 10:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P93673 and previous config saved to /var/cache/conftool/dbconfig/20260603-100029-fceratto.json
* 09:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2196.codfw.wmnet with OS trixie
* 09:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: Upgrading db2196.codfw.wmnet
* 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2196: Upgrading db2196.codfw.wmnet
* 09:57 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1053: repool after upgrade
* 09:52 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:52 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:52 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:51 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:51 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:51 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 09:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P93670 and previous config saved to /var/cache/conftool/dbconfig/20260603-095022-fceratto.json
* 09:49 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:49 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:48 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1053.eqiad.wmnet with OS trixie
* 09:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2013.codfw.wmnet
* 09:41 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on es1053.eqiad.wmnet with reason: host reimage
* 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1053.eqiad.wmnet with reason: host reimage
* 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93669 and previous config saved to /var/cache/conftool/dbconfig/20260603-094014-fceratto.json
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2215: Migration of db2215.codfw.wmnet completed
* 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2013.codfw.wmnet
* 09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93667 and previous config saved to /var/cache/conftool/dbconfig/20260603-093146-fceratto.json
* 09:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
* 09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93666 and previous config saved to /var/cache/conftool/dbconfig/20260603-093119-fceratto.json
* 09:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1211: Migration of db1211.eqiad.wmnet completed
* 09:27 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297069{{!}}hCaptcha: Collect risk score for blocked account creations (T427784)]] (duration: 07m 26s)
* 09:25 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1053.eqiad.wmnet with OS trixie
* 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add public1-b3-codfw gateway IPs - ayounsi@cumin1003"
* 09:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add public1-b3-codfw gateway IPs - ayounsi@cumin1003"
* 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1053: Upgrading es1053.eqiad.wmnet
* 09:23 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1053: Upgrading es1053.eqiad.wmnet
* 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:21 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297069{{!}}hCaptcha: Collect risk score for blocked account creations (T427784)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:21 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 09:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2054: repool after upgrade
* 09:21 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
* 09:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P93661 and previous config saved to /var/cache/conftool/dbconfig/20260603-092111-fceratto.json
* 09:20 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:20 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297069{{!}}hCaptcha: Collect risk score for blocked account creations (T427784)]]
* 09:14 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297065{{!}}Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"]] (duration: 07m 06s)
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P93659 and previous config saved to /var/cache/conftool/dbconfig/20260603-091104-fceratto.json
* 09:10 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:09 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297065{{!}}Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:07 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297065{{!}}Revert^4 "hCaptcha: Load self-hosted secure-api.js on group0 wikis"]]
* 09:06 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 09:06 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1297064{{!}}Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] (duration: 10m 54s)
* 09:05 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 09:04 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 09:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003 - [[phab:T422043|T422043]]"
* 09:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93656 and previous config saved to /var/cache/conftool/dbconfig/20260603-090056-fceratto.json
* 09:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003 - [[phab:T422043|T422043]]"
* 09:00 ayounsi@cumin1003: END (ERROR) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=97) generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003"
* 09:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new eqiad/codfw public vlans - ayounsi@cumin1003"
* 08:59 kharlan@deploy1003: kharlan: Continuing with deployment
* 08:59 kharlan@deploy1003: kharlan: Backport for [[gerrit:1297064{{!}}Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:55 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1297064{{!}}Revert^3 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]]
* 08:53 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296635{{!}}Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] (duration: 11m 43s)
* 08:52 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2215: Migration of db2215.codfw.wmnet completed
* 08:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet
* 08:52 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet
* 08:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb[1022-1023].eqiad.wmnet
* 08:51 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb[1022-1023].eqiad.wmnet
* 08:50 kharlan@deploy1003: kharlan: Rolling back deployment
* 08:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93652 and previous config saved to /var/cache/conftool/dbconfig/20260603-084846-fceratto.json
* 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 08:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93651 and previous config saved to /var/cache/conftool/dbconfig/20260603-084819-fceratto.json
* 08:47 kharlan@deploy1003: kharlan: Backport for [[gerrit:1296635{{!}}Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2215.codfw.wmnet with OS trixie
* 08:45 jiji@cumin1003: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) check docker-registry: maintenance
* 08:45 jiji@cumin1003: START - Cookbook sre.discovery.service-route check docker-registry: maintenance
* 08:43 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1211: Migration of db1211.eqiad.wmnet completed
* 08:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296635{{!}}Revert^2 "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]]
* 08:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1211.eqiad.wmnet with OS trixie
* 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93649 and previous config saved to /var/cache/conftool/dbconfig/20260603-083811-fceratto.json
* 08:37 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296632{{!}}Image Browsing: add accessible labels to carousel elements (T407793)]] (duration: 32m 11s)
* 08:36 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2054: repool after upgrade
* 08:35 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool es2054.codfw.wmnet: After reimage
* 08:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2054.codfw.wmnet: After reimage
* 08:35 jiji@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:34 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 08:34 jiji@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:33 jiji@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:33 jiji@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:31 jiji@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:31 jiji@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2054.codfw.wmnet with OS trixie
* 08:30 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:29 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2215.codfw.wmnet with reason: host reimage
* 08:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93647 and previous config saved to /var/cache/conftool/dbconfig/20260603-082804-fceratto.json
* 08:25 mszwarc@deploy1003: mlitn, mszwarc: Continuing with deployment
* 08:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1049: repool after upgrade
* 08:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2215.codfw.wmnet with reason: host reimage
* 08:22 mszwarc@deploy1003: mlitn, mszwarc: Backport for [[gerrit:1296632{{!}}Image Browsing: add accessible labels to carousel elements (T407793)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:18 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1211.eqiad.wmnet with reason: host reimage
* 08:18 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 08:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93645 and previous config saved to /var/cache/conftool/dbconfig/20260603-081756-fceratto.json
* 08:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 08:17 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 08:16 jiji@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2054.codfw.wmnet with reason: host reimage
* 08:08 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2054.codfw.wmnet with reason: host reimage
* 08:05 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1296632{{!}}Image Browsing: add accessible labels to carousel elements (T407793)]]
* 08:04 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296580{{!}}Add kha to wmgExtraLanguageNames (T427917)]], [[gerrit:1296703{{!}}jawiki: lift IP caps for workshop (T427912)]], [[gerrit:1296713{{!}}conductwiki: add sitename and logo (T426984 T427541)]], [[gerrit:1296627{{!}}Add missing lazy img to carousel (T427821)]], [[gerrit:1295968{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T426799)]]
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93643 and previous config saved to /var/cache/conftool/dbconfig/20260603-080346-fceratto.json
* 08:03 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1211.eqiad.wmnet with OS trixie
* 08:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 08:03 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2215.codfw.wmnet with OS trixie
* 08:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1211: Upgrading db1211.eqiad.wmnet
* 08:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2215: Upgrading db2215.codfw.wmnet
* 08:01 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:01 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1211: Upgrading db1211.eqiad.wmnet
* 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2215: Upgrading db2215.codfw.wmnet
* 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:01 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1157: Repooling
* 08:01 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1157: Repooling
* 08:00 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 07:57 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1022-1023].eqiad.wmnet with reason: Reimaging upstream server
* 07:57 mszwarc@deploy1003: anzx, mlitn, mfossati, mszwarc: Continuing with deployment
* 07:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Reimaging upstream server
* 07:54 mszwarc@deploy1003: anzx, mlitn, mfossati, mszwarc: Backport for [[gerrit:1296580{{!}}Add kha to wmgExtraLanguageNames (T427917)]], [[gerrit:1296703{{!}}jawiki: lift IP caps for workshop (T427912)]], [[gerrit:1296713{{!}}conductwiki: add sitename and logo (T426984 T427541)]], [[gerrit:1296627{{!}}Add missing lazy img to carousel (T427821)]], [[gerrit:1295968{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T42]]
* 07:52 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2231: repool after maintenance
* 07:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2054.codfw.wmnet with OS trixie
* 07:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2054: Upgrading es2054.codfw.wmnet
* 07:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2054: Upgrading es2054.codfw.wmnet
* 07:50 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1296580{{!}}Add kha to wmgExtraLanguageNames (T427917)]], [[gerrit:1296703{{!}}jawiki: lift IP caps for workshop (T427912)]], [[gerrit:1296713{{!}}conductwiki: add sitename and logo (T426984 T427541)]], [[gerrit:1296627{{!}}Add missing lazy img to carousel (T427821)]], [[gerrit:1295968{{!}}MultimediaViewer: enable image carousel as a beta feature on Wikipedias (T426799)]]
* 07:48 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296516{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]], [[gerrit:1296517{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]] (duration: 32m 13s)
* 07:44 marostegui@dns1004: END - running authdns-update
* 07:43 marostegui@dns1004: START - running authdns-update
* 07:42 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1056 to es2 eqiad primary [[phab:T427875|T427875]]', diff saved to https://phabricator.wikimedia.org/P93637 and previous config saved to /var/cache/conftool/dbconfig/20260603-074250-marostegui.json
* 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1049: repool after upgrade
* 07:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:35 mszwarc@deploy1003: mszwarc, stran: Continuing with deployment
* 07:35 mszwarc@deploy1003: mszwarc, stran: Backport for [[gerrit:1296516{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]], [[gerrit:1296517{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1049.eqiad.wmnet with OS trixie
* 07:16 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1296516{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]], [[gerrit:1296517{{!}}Add a reply-to to Direct Reporting emails (T427788 T427791 T427829)]]
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1049.eqiad.wmnet with reason: host reimage
* 07:07 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1049.eqiad.wmnet with reason: host reimage
* 07:07 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2231: repool after maintenance
* 07:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2231.codfw.wmnet with OS trixie
* 06:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1049.eqiad.wmnet with OS trixie
* 06:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1049: Upgrading es1049.eqiad.wmnet
* 06:46 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2056 to es2 codfw primary [[phab:T427875|T427875]]', diff saved to https://phabricator.wikimedia.org/P93632 and previous config saved to /var/cache/conftool/dbconfig/20260603-064623-marostegui.json
* 06:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1049: Upgrading es1049.eqiad.wmnet
* 06:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1056: repool after upgrade
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2231.codfw.wmnet with reason: host reimage
* 06:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2231.codfw.wmnet with reason: host reimage
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2231.codfw.wmnet with OS trixie
* 06:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2231: Upgrading db2231.codfw.wmnet
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2231: Upgrading db2231.codfw.wmnet
* 06:09 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:59 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1056: repool after upgrade
* 05:59 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 05:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1056.eqiad.wmnet with OS trixie
* 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1056.eqiad.wmnet with reason: host reimage
* 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1056.eqiad.wmnet with reason: host reimage
* 05:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1056.eqiad.wmnet with OS trixie
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1056: Upgrading es1056.eqiad.wmnet
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1056: Upgrading es1056.eqiad.wmnet
* 05:16 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
== 2026-06-02 ==
* 22:21 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296689{{!}}hCaptcha: Correct inaccurate comment]] (duration: 06m 27s)
* 22:18 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:18 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:17 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 22:17 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296689{{!}}hCaptcha: Correct inaccurate comment]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:15 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296689{{!}}hCaptcha: Correct inaccurate comment]]
* 22:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296551{{!}}hCaptcha: Enable for badlogin on group0 wikis (T426875)]] (duration: 08m 31s)
* 22:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:10 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:09 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 22:07 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296551{{!}}hCaptcha: Enable for badlogin on group0 wikis (T426875)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:05 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296551{{!}}hCaptcha: Enable for badlogin on group0 wikis (T426875)]]
* 20:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93621 and previous config saved to /var/cache/conftool/dbconfig/20260602-203945-fceratto.json
* 20:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93620 and previous config saved to /var/cache/conftool/dbconfig/20260602-202937-fceratto.json
* 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1054.eqiad.wmnet
* 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:27 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1054.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:26 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1054.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:20 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 20:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P93619 and previous config saved to /var/cache/conftool/dbconfig/20260602-201929-fceratto.json
* 20:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93618 and previous config saved to /var/cache/conftool/dbconfig/20260602-200922-fceratto.json
* 20:03 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1054.eqiad.wmnet
* 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1053.eqiad.wmnet
* 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1053.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:37 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1053.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93617 and previous config saved to /var/cache/conftool/dbconfig/20260602-190907-fceratto.json
* 19:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
* 19:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93616 and previous config saved to /var/cache/conftool/dbconfig/20260602-190811-fceratto.json
* 19:05 dancy@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.5 refs [[phab:T423914|T423914]]
* 18:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P93615 and previous config saved to /var/cache/conftool/dbconfig/20260602-185804-fceratto.json
* 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P93614 and previous config saved to /var/cache/conftool/dbconfig/20260602-184757-fceratto.json
* 18:38 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:38 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:38 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93612 and previous config saved to /var/cache/conftool/dbconfig/20260602-183749-fceratto.json
* 18:37 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:37 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:33 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1053.eqiad.wmnet
* 18:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93611 and previous config saved to /var/cache/conftool/dbconfig/20260602-183023-fceratto.json
* 18:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93610 and previous config saved to /var/cache/conftool/dbconfig/20260602-182956-fceratto.json
* 18:27 mutante: gerrit delete unused plugin projects: barricade, WikimediaBlocks and WikimediaWebSessions
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1052.eqiad.wmnet
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1052.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1052.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:25 dancy: Train is blocked at testwikis on https://phabricator.wikimedia.org/T427935
* 18:21 Daimona: Running query from [[phab:T427962|T427962]]#11978299 in x1.wikishared
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P93609 and previous config saved to /var/cache/conftool/dbconfig/20260602-181949-fceratto.json
* 18:16 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296615{{!}}feat(cleanMentorList): Add a feature flag (T427386)]], [[gerrit:1296614{{!}}feat(cleanMentorList): Add a feature flag (T427386)]] (duration: 34m 09s)
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:12 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:12 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:12 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P93608 and previous config saved to /var/cache/conftool/dbconfig/20260602-180941-fceratto.json
* 18:08 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 18:07 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 18:06 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 18:06 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 18:05 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 18:04 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 18:02 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 18:02 swfrench-wmf: reverting shellbox to 2026-05-20-192555 due to errors in shellbox-syntaxhighlight
* 18:02 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 18:01 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 18:01 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 18:01 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1296615{{!}}feat(cleanMentorList): Add a feature flag (T427386)]], [[gerrit:1296614{{!}}feat(cleanMentorList): Add a feature flag (T427386)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:00 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1052.eqiad.wmnet
* 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93607 and previous config saved to /var/cache/conftool/dbconfig/20260602-175933-fceratto.json
* 17:58 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:57 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1051.eqiad.wmnet
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1051.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:55 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1051.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:53 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93605 and previous config saved to /var/cache/conftool/dbconfig/20260602-175227-fceratto.json
* 17:52 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93604 and previous config saved to /var/cache/conftool/dbconfig/20260602-175157-fceratto.json
* 17:51 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:51 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:50 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:50 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:50 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:49 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:49 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:48 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:48 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:47 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:44 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:43 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:43 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:42 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:42 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:42 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P93603 and previous config saved to /var/cache/conftool/dbconfig/20260602-174150-fceratto.json
* 17:41 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1296615{{!}}feat(cleanMentorList): Add a feature flag (T427386)]], [[gerrit:1296614{{!}}feat(cleanMentorList): Add a feature flag (T427386)]]
* 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P93602 and previous config saved to /var/cache/conftool/dbconfig/20260602-173143-fceratto.json
* 17:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93601 and previous config saved to /var/cache/conftool/dbconfig/20260602-172135-fceratto.json
* 17:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93600 and previous config saved to /var/cache/conftool/dbconfig/20260602-171422-fceratto.json
* 17:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93599 and previous config saved to /var/cache/conftool/dbconfig/20260602-171354-fceratto.json
* 17:04 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P93598 and previous config saved to /var/cache/conftool/dbconfig/20260602-170344-fceratto.json
* 16:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P93597 and previous config saved to /var/cache/conftool/dbconfig/20260602-165336-fceratto.json
* 16:49 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1051.eqiad.wmnet
* 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1050.eqiad.wmnet
* 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:48 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1050.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:47 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1050.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93596 and previous config saved to /var/cache/conftool/dbconfig/20260602-164328-fceratto.json
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93595 and previous config saved to /var/cache/conftool/dbconfig/20260602-163622-fceratto.json
* 16:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 16:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93594 and previous config saved to /var/cache/conftool/dbconfig/20260602-163550-fceratto.json
* 16:34 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:34 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS trixie
* 16:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 16:27 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2006.codfw.wmnet with OS trixie
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P93593 and previous config saved to /var/cache/conftool/dbconfig/20260602-162542-fceratto.json
* 16:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P93591 and previous config saved to /var/cache/conftool/dbconfig/20260602-161534-fceratto.json
* 16:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 16:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS trixie
* 16:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296624{{!}}Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] (duration: 06m 40s)
* 16:09 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
* 16:05 kharlan@deploy1003: kharlan: Continuing with deployment
* 16:05 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 16:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93590 and previous config saved to /var/cache/conftool/dbconfig/20260602-160527-fceratto.json
* 16:05 jayme@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2006.codfw.wmnet with reason: host reimage
* 16:05 kharlan@deploy1003: kharlan: Backport for [[gerrit:1296624{{!}}Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:03 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296624{{!}}Revert "hCaptcha: Load self-hosted secure-api.js on group0 wikis" (T403829)]]
* 15:59 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295909{{!}}hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)]] (duration: 09m 48s)
* 15:59 kharlan@deploy1003: kharlan: Rolling back deployment
* 15:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93589 and previous config saved to /var/cache/conftool/dbconfig/20260602-155817-fceratto.json
* 15:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 15:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93588 and previous config saved to /var/cache/conftool/dbconfig/20260602-155749-fceratto.json
* 15:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 15:53 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS trixie
* 15:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS trixie
* 15:51 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295909{{!}}hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:50 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 15:49 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295909{{!}}hCaptcha: Load self-hosted secure-api.js on group0 wikis (T403829)]]
* 15:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P93587 and previous config saved to /var/cache/conftool/dbconfig/20260602-154742-fceratto.json
* 15:47 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296558{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]], [[gerrit:1296568{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]] (duration: 07m 24s)
* 15:43 kharlan@deploy1003: kharlan: Continuing with deployment
* 15:42 kharlan@deploy1003: kharlan: Backport for [[gerrit:1296558{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]], [[gerrit:1296568{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:40 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1296558{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]], [[gerrit:1296568{{!}}hCaptcha: Remove apiUrl health check and APCu layer from health checker (T421464)]]
* 15:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P93586 and previous config saved to /var/cache/conftool/dbconfig/20260602-153734-fceratto.json
* 15:37 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS trixie
* 15:36 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS trixie
* 15:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 15:32 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:32 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:31 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 15:30 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:29 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93585 and previous config saved to /var/cache/conftool/dbconfig/20260602-152726-fceratto.json
* 15:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2158: Repooling
* {{safesubst:SAL entry|1=15:22 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295502{{!}}Revert "labswiki: Disallow account autocreation"]], [[gerrit:1283106{{!}}Remove unused 'writeapi' right]], [[gerrit:1296566{{!}}Clean up bot password configuration]], [[gerrit:1296563{{!}}Remove workaround for stuck session cookies on Wikitech (T389433)]], [[gerrit:1295574{{!}}cswiki: lift IP cap for workshop on 08-June-2026 (T427678)]], [[gerrit:1296582{{!}}U}}
* 15:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 15:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93583 and previous config saved to /var/cache/conftool/dbconfig/20260602-152026-fceratto.json
* 15:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93582 and previous config saved to /var/cache/conftool/dbconfig/20260602-151958-fceratto.json
* 15:19 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:19 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 15:18 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS trixie
* 15:18 dreamyjazz@deploy1003: matmarex, anzx, dreamyjazz: Continuing with deployment
* 15:18 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 15:15 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* {{safesubst:SAL entry|1=15:15 dreamyjazz@deploy1003: matmarex, anzx, dreamyjazz: Backport for [[gerrit:1295502{{!}}Revert "labswiki: Disallow account autocreation"]], [[gerrit:1283106{{!}}Remove unused 'writeapi' right]], [[gerrit:1296566{{!}}Clean up bot password configuration]], [[gerrit:1296563{{!}}Remove workaround for stuck session cookies on Wikitech (T389433)]], [[gerrit:1295574{{!}}cswiki: lift IP cap for workshop on 08-June-2026 (T427678)]], [[gerrit:1296582}}
* 15:14 jiji@cumin1003: START - Cookbook sre.dns.netbox
* {{safesubst:SAL entry|1=15:13 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1295502{{!}}Revert "labswiki: Disallow account autocreation"]], [[gerrit:1283106{{!}}Remove unused 'writeapi' right]], [[gerrit:1296566{{!}}Clean up bot password configuration]], [[gerrit:1296563{{!}}Remove workaround for stuck session cookies on Wikitech (T389433)]], [[gerrit:1295574{{!}}cswiki: lift IP cap for workshop on 08-June-2026 (T427678)]], [[gerrit:1296582{{!}}Us}}
* 15:12 jayme@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2006.codfw.wmnet with OS trixie
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS trixie
* 15:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P93580 and previous config saved to /var/cache/conftool/dbconfig/20260602-150951-fceratto.json
* 15:09 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296514{{!}}[Growth] Set wgGEMentorshipCleanupEnabled to false on all wikis (T427386)]] (duration: 06m 22s)
* 15:06 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Repooling after Icinga wait-for-green timeout
* 15:06 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1050.eqiad.wmnet
* 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1049.eqiad.wmnet
* 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1049.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:05 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1049.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:02 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1296514{{!}}[Growth] Set wgGEMentorshipCleanupEnabled to false on all wikis (T427386)]]
* 15:02 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS trixie
* 15:01 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P93578 and previous config saved to /var/cache/conftool/dbconfig/20260602-145943-fceratto.json
* 14:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 14:52 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:52 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:52 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1049.eqiad.wmnet
* 14:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS trixie
* 14:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:50 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93575 and previous config saved to /var/cache/conftool/dbconfig/20260602-144935-fceratto.json
* 14:42 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for pc2021.codfw.wmnet
* 14:42 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for pc2021.codfw.wmnet
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2250.codfw.wmnet
* 14:41 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2250.codfw.wmnet
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2158.codfw.wmnet
* 14:41 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2158.codfw.wmnet
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Repooling
* 14:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Repooling
* 14:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93573 and previous config saved to /var/cache/conftool/dbconfig/20260602-144110-fceratto.json
* 14:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 14:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2158: Repooling
* 14:40 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93571 and previous config saved to /var/cache/conftool/dbconfig/20260602-144043-fceratto.json
* 14:38 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:38 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:38 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver: apply
* 14:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1048.eqiad.wmnet
* 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1048.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 14:37 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS trixie
* 14:36 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS trixie
* 14:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P93569 and previous config saved to /var/cache/conftool/dbconfig/20260602-143035-fceratto.json
* 14:30 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 14:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1048.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 14:21 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Repooling after Icinga wait-for-green timeout
* 14:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P93566 and previous config saved to /var/cache/conftool/dbconfig/20260602-142027-fceratto.json
* 14:17 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS trixie
* 14:17 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 14:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1167.eqiad.wmnet
* 14:17 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1167.eqiad.wmnet
* 14:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS trixie
* 14:15 jayme@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS trixie
* 14:14 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93564 and previous config saved to /var/cache/conftool/dbconfig/20260602-141019-fceratto.json
* 14:09 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments userOptions.php --delete --nowarn growthexperiments-homepage-variant # [[phab:T417621|T417621]]
* 14:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1048.eqiad.wmnet
* 14:08 urbanecm@deploy1003: mwscript-k8s job started: foreachwikiindblist growthexperiments userOptions.php --delete growthexperiments-homepage-variant # [[phab:T417621|T417621]]
* 14:05 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93563 and previous config saved to /var/cache/conftool/dbconfig/20260602-140140-fceratto.json
* 14:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 14:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 14:01 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS trixie
* 14:00 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 14:00 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93562 and previous config saved to /var/cache/conftool/dbconfig/20260602-140022-fceratto.json
* 14:00 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS trixie
* 13:56 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1167.eqiad.wmnet with OS trixie
* 13:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 13:51 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P93561 and previous config saved to /var/cache/conftool/dbconfig/20260602-135015-fceratto.json
* 13:47 topranks: revert all config to normal on cr1-codfw and ssw1-a1-codfw
* 13:43 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS trixie
* 13:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 13:40 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS trixie
* 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P93560 and previous config saved to /var/cache/conftool/dbconfig/20260602-134007-fceratto.json
* 13:38 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
* 13:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1002.eqiad.wmnet with OS trixie
* 13:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs1003.eqiad.wmnet with OS trixie
* 13:34 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:34 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 13:32 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 13:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1167.eqiad.wmnet with reason: host reimage
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93559 and previous config saved to /var/cache/conftool/dbconfig/20260602-132959-fceratto.json
* 13:27 slyngshede@dns1004: END - running authdns-update
* 13:25 slyngshede@dns1004: START - running authdns-update
* 13:24 topranks: increase OSPF cost on ssw1-a1-codfw et-0/0/4 towards lsw1-a5-codfw [[phab:T427301|T427301]]
* 13:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 13:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93558 and previous config saved to /var/cache/conftool/dbconfig/20260602-132314-fceratto.json
* 13:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
* 13:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93557 and previous config saved to /var/cache/conftool/dbconfig/20260602-132246-fceratto.json
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS trixie
* 13:19 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 13:19 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS trixie
* 13:18 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 13:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2049: repool after upgrade
* 13:17 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:16 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1167.eqiad.wmnet with OS trixie
* 13:15 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1167: Upgrading db1167.eqiad.wmnet
* 13:13 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1167: Upgrading db1167.eqiad.wmnet
* 13:13 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:12 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 13:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P93554 and previous config saved to /var/cache/conftool/dbconfig/20260602-131238-fceratto.json
* 13:12 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 13:12 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 13:11 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 13:07 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1003.eqiad.wmnet with OS trixie
* 13:07 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1002.eqiad.wmnet with OS trixie
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS trixie
* 13:04 jayme@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2006.codfw.wmnet with OS trixie
* 13:04 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:03 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on clouddb[1022-1023].eqiad.wmnet with reason: Reimaging upstream servers
* 13:03 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs1001.eqiad.wmnet with OS trixie
* 13:03 topranks: increase OSPF cost on ssw1-a1-codfw et-0/0/2 towards lsw1-a3-codfw [[phab:T427301|T427301]]
* 13:03 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 13:02 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Reimaging upstream servers
* 13:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P93553 and previous config saved to /var/cache/conftool/dbconfig/20260602-130230-fceratto.json
* 12:59 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:57 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2161: Migration of db2161.codfw.wmnet completed
* 12:54 topranks: shutdown sub-interfaces on cr1-codfw et-1/1/5 for row A/B vlans [[phab:T427301|T427301]]
* 12:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93550 and previous config saved to /var/cache/conftool/dbconfig/20260602-125223-fceratto.json
* 12:50 topranks: enable bgp graceful-shutdown in overlay on ssw1-a1-codfw [[phab:T427301|T427301]]
* 12:49 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1061.eqiad.wmnet with OS trixie
* 12:48 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt
* 12:48 ayounsi@cumin1003: START - Cookbook sre.hosts.remove-downtime for lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt
* 12:47 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS trixie
* 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93548 and previous config saved to /var/cache/conftool/dbconfig/20260602-124541-fceratto.json
* 12:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1207.eqiad.wmnet with reason: Maintenance
* 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93547 and previous config saved to /var/cache/conftool/dbconfig/20260602-124512-fceratto.json
* 12:43 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1060.eqiad.wmnet with OS trixie
* 12:42 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 12:42 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 12:42 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 12:41 topranks: enable bgp graceful-shutdown in underlay on ssw1-a1-codfw [[phab:T427301|T427301]]
* 12:35 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 12:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P93545 and previous config saved to /var/cache/conftool/dbconfig/20260602-123505-fceratto.json
* 12:33 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main'.
* 12:33 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 12:31 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2049: repool after upgrade
* 12:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:29 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS trixie
* 12:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2049.codfw.wmnet with OS trixie
* 12:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P93542 and previous config saved to /var/cache/conftool/dbconfig/20260602-122459-fceratto.json
* 12:24 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS trixie
* 12:21 XioNoX: reboot lsw1-a3-codfw for software upgrade - [[phab:T427301|T427301]]
* 12:20 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS trixie
* 12:20 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 12:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS trixie
* 12:17 jayme@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 12:16 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296532{{!}}hCaptcha: Deduplicate edit API detection code (T427887)]], [[gerrit:1296533{{!}}hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)]] (duration: 09m 02s)
* 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93539 and previous config saved to /var/cache/conftool/dbconfig/20260602-121451-fceratto.json
* 12:11 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2049.codfw.wmnet with reason: host reimage
* 12:11 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lsw1-a3-codfw,lsw1-a3-codfw IPv6,lsw1-a3-codfw.mgmt with reason: Switch maintenance
* 12:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2161: Migration of db2161.codfw.wmnet completed
* 12:09 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Switch maintenance
* 12:09 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1296532{{!}}hCaptcha: Deduplicate edit API detection code (T427887)]], [[gerrit:1296533{{!}}hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1200 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93537 and previous config saved to /var/cache/conftool/dbconfig/20260602-120755-fceratto.json
* 12:07 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 12:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
* 12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93536 and previous config saved to /var/cache/conftool/dbconfig/20260602-120728-fceratto.json
* 12:07 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2011,2033-2034,2050,2055-2062,2068-2071,2107-2113].codfw.wmnet
* 12:07 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1296532{{!}}hCaptcha: Deduplicate edit API detection code (T427887)]], [[gerrit:1296533{{!}}hCaptcha: Disable hCaptcha for DiscussionTools for the apps (T427887)]]
* 12:05 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2049.codfw.wmnet with reason: host reimage
* 12:04 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 12:02 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 12:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2161.codfw.wmnet with OS trixie
* 12:00 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 11:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P93535 and previous config saved to /var/cache/conftool/dbconfig/20260602-115721-fceratto.json
* 11:55 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 11:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:55 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:55 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 11:53 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 11:53 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 11:53 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:50 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS trixie
* 11:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS trixie
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2049.codfw.wmnet with OS trixie
* 11:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2049: Upgrading es2049.codfw.wmnet
* 11:48 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2049: Upgrading es2049.codfw.wmnet
* 11:47 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:47 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS trixie
* 11:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2056: repool after upgrade
* 11:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P93532 and previous config saved to /var/cache/conftool/dbconfig/20260602-114713-fceratto.json
* 11:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS trixie
* 11:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2161.codfw.wmnet with reason: host reimage
* 11:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2161.codfw.wmnet with reason: host reimage
* 11:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93531 and previous config saved to /var/cache/conftool/dbconfig/20260602-113705-fceratto.json
* 11:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1185 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93529 and previous config saved to /var/cache/conftool/dbconfig/20260602-113019-fceratto.json
* 11:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
* 11:29 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 11:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1161: Repooling
* 11:26 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1161: Repooling
* 11:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS trixie
* 11:22 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 11:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2161: Upgrading db2161.codfw.wmnet
* 11:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2161: Upgrading db2161.codfw.wmnet
* 11:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 11:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P93527 and previous config saved to /var/cache/conftool/dbconfig/20260602-111954-fceratto.json
* 11:15 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2161 [[phab:T427892|T427892]]', diff saved to https://phabricator.wikimedia.org/P93525 and previous config saved to /var/cache/conftool/dbconfig/20260602-111511-cwilliams.json
* 11:12 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2165 to s8 primary [[phab:T427892|T427892]]', diff saved to https://phabricator.wikimedia.org/P93524 and previous config saved to /var/cache/conftool/dbconfig/20260602-111200-cwilliams.json
* 11:10 cezmunsta: Starting s8 codfw failover from db2161 to db2165 - [[phab:T427892|T427892]]
* 11:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P93523 and previous config saved to /var/cache/conftool/dbconfig/20260602-110947-fceratto.json
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS trixie
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS trixie
* 11:04 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2165 with weight 0 [[phab:T427892|T427892]]', diff saved to https://phabricator.wikimedia.org/P93522 and previous config saved to /var/cache/conftool/dbconfig/20260602-110420-cwilliams.json
* 11:03 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 [[phab:T427892|T427892]]
* 11:02 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2056: repool after upgrade
* 11:01 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93520 and previous config saved to /var/cache/conftool/dbconfig/20260602-105939-fceratto.json
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93519 and previous config saved to /var/cache/conftool/dbconfig/20260602-105239-fceratto.json
* 10:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 10:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93518 and previous config saved to /var/cache/conftool/dbconfig/20260602-105202-fceratto.json
* 10:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2056.codfw.wmnet with OS trixie
* 10:42 moritzm: installing busybox security updates
* 10:42 claime: Enabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P93517 and previous config saved to /var/cache/conftool/dbconfig/20260602-104154-fceratto.json
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P93516 and previous config saved to /var/cache/conftool/dbconfig/20260602-103146-fceratto.json
* 10:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2056.codfw.wmnet with reason: host reimage
* 10:27 claime: Disabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 10:25 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2056.codfw.wmnet with reason: host reimage
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93515 and previous config saved to /var/cache/conftool/dbconfig/20260602-102139-fceratto.json
* 10:09 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2056.codfw.wmnet with OS trixie
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2056: Upgrading es2056.codfw.wmnet
* 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2056: Upgrading es2056.codfw.wmnet
* 10:08 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:06 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 09:56 claime: Enabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 09:46 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on cumin2003.codfw.wmnet with reason: in setup
* 09:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1187: Pooling
* 09:37 claime: Running puppet on cp6010 and cp6011 - [[phab:T422937|T422937]]
* 09:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow2004.codfw.wmnet to plain
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93511 and previous config saved to /var/cache/conftool/dbconfig/20260602-093716-fceratto.json
* 09:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1159.eqiad.wmnet with reason: Maintenance
* 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow2004.codfw.wmnet to plain
* 09:34 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of rpki2003.codfw.wmnet to plain
* 09:34 claime: Disabling puppet on A:cp-text for ATS rest-gateway cleanup - [[phab:T422937|T422937]]
* 09:34 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of rpki2003.codfw.wmnet to plain
* 09:32 moritzm: temporarily remove ganeti2045 from the codfw cluster [[phab:T427357|T427357]]
* 09:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1055.eqiad.wmnet with OS trixie
* 09:15 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1187: Pooling
* 09:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1187 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93508 and previous config saved to /var/cache/conftool/dbconfig/20260602-091126-fceratto.json
* 09:09 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1187 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93506 and previous config saved to /var/cache/conftool/dbconfig/20260602-090432-fceratto.json
* 09:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
* 08:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2250.codfw.wmnet with reason: rack A3 maintenance
* 08:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS trixie
* 08:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:54 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:53 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:52 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:51 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:50 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 08:41 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 08:37 urbanecm: Reset user email of Barras@votewiki to the one of Barras@SUL
* 08:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 08:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93505 and previous config saved to /var/cache/conftool/dbconfig/20260602-083033-fceratto.json
* 08:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 08:29 slyngs: IDP, new configuration in preparation for webauthn
* 08:20 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P93504 and previous config saved to /var/cache/conftool/dbconfig/20260602-082026-fceratto.json
* 08:19 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:18 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:18 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:17 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296488{{!}}Revert "translate: adding separate read/write endpoints" (T425377)]] (duration: 03m 33s)
* 08:16 atsuko@deploy1003: atsuko: Rolling back deployment
* 08:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2053: repool after upgrade
* 08:15 atsuko@deploy1003: atsuko: Backport for [[gerrit:1296488{{!}}Revert "translate: adding separate read/write endpoints" (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:13 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1296488{{!}}Revert "translate: adding separate read/write endpoints" (T425377)]]
* 08:11 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:10 marostegui: Install mariadb 10.11.17 on es2053 [[phab:T427345|T427345]]
* 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P93502 and previous config saved to /var/cache/conftool/dbconfig/20260602-081018-fceratto.json
* 08:09 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Depool for rack maintenance
* 08:03 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296262{{!}}translate: fixing missed variable in credentials formatting closure (T425377)]] (duration: 14m 47s)
* 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1181 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93499 and previous config saved to /var/cache/conftool/dbconfig/20260602-080011-fceratto.json
* 07:59 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 07:59 atsuko@deploy1003: atsuko: Rolling back deployment
* 07:58 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 07:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1181 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93498 and previous config saved to /var/cache/conftool/dbconfig/20260602-075759-fceratto.json
* 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 07:57 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
* 07:50 atsuko@deploy1003: atsuko: Backport for [[gerrit:1296262{{!}}translate: fixing missed variable in credentials formatting closure (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:49 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1296262{{!}}translate: fixing missed variable in credentials formatting closure (T425377)]]
* 07:48 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1181: Pooling
* 07:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1181: Pooling
* 07:44 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1181: Reboot
* 07:43 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1181: Reboot
* 07:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1181.eqiad.wmnet with reason: Reboot
* 07:41 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1181: Migration of db1181.eqiad.wmnet completed
* 07:40 atsuko@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294949{{!}}translate: adding separate read/write endpoints (T425377)]] (duration: 21m 01s)
* 07:39 atsuko@deploy1003: atsuko: Rolling back deployment
* 07:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93490 and previous config saved to /var/cache/conftool/dbconfig/20260602-073904-fceratto.json
* 07:32 XioNoX: pfw1-eqiad# delete protocols bgp group Production family inet6 - [[phab:T423384|T423384]]
* 07:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2053: repool after upgrade
* 07:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2158.codfw.wmnet with reason: rack A3 maintenance
* 07:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93487 and previous config saved to /var/cache/conftool/dbconfig/20260602-072856-fceratto.json
* 07:28 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2158: rack A3 maintenance
* 07:28 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2158: rack A3 maintenance
* 07:27 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on pc2021.codfw.wmnet with reason: rack A3 maintenance
* 07:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2021: rack A3 maintenance
* 07:26 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 07:25 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 07:25 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc2021: rack A3 maintenance
* 07:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Depool for rack maintenance
* 07:23 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2241.codfw.wmnet
* 07:23 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2241.codfw.wmnet
* 07:21 atsuko@deploy1003: atsuko: Backport for [[gerrit:1294949{{!}}translate: adding separate read/write endpoints (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2053.codfw.wmnet with OS trixie
* 07:19 atsuko@deploy1003: Started scap sync-world: Backport for [[gerrit:1294949{{!}}translate: adding separate read/write endpoints (T425377)]]
* 07:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2241.codfw.wmnet with reason: Depool for rack maintenance
* 07:14 marostegui: Install mariadb 10.11.17 on db2186 [[phab:T427345|T427345]]
* 07:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Depool for rack maintenance
* 07:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2186.codfw.wmnet with reason: upgrade
* 07:12 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Depool for rack maintenance
* 07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2053.codfw.wmnet with reason: host reimage
* 06:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2053.codfw.wmnet with reason: host reimage
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93478 and previous config saved to /var/cache/conftool/dbconfig/20260602-065533-fceratto.json
* 06:55 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1181: Migration of db1181.eqiad.wmnet completed
* 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 06:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1181.eqiad.wmnet with OS trixie
* 06:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2053.codfw.wmnet with OS trixie
* 06:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2053: Upgrading es2053.codfw.wmnet
* 06:41 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2053: Upgrading es2053.codfw.wmnet
* 06:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 06:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 06:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 06:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1052: repool after upgrade
* 06:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
* 06:24 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
* 06:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 06:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 06:16 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 06:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 06:08 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1181.eqiad.wmnet with OS trixie
* 06:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1181: Upgrading db1181.eqiad.wmnet
* 06:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1181: Upgrading db1181.eqiad.wmnet
* 06:04 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:02 marostegui@dns1004: END - running authdns-update
* 06:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1181 [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93473 and previous config saved to /var/cache/conftool/dbconfig/20260602-060157-marostegui.json
* 06:01 marostegui@dns1004: START - running authdns-update
* 06:00 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1236 to s7 primary and set section read-write [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93472 and previous config saved to /var/cache/conftool/dbconfig/20260602-060041-marostegui.json
* 06:00 marostegui@cumin1003: dbctl commit (dc=all): 'Set s7 eqiad as read-only for maintenance - [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93471 and previous config saved to /var/cache/conftool/dbconfig/20260602-060018-marostegui.json
* 06:00 marostegui: Starting s7 eqiad failover from db1181 to db1236 - [[phab:T426088|T426088]]
* 05:51 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1236 with weight 0 [[phab:T426088|T426088]]', diff saved to https://phabricator.wikimedia.org/P93470 and previous config saved to /var/cache/conftool/dbconfig/20260602-055153-marostegui.json
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s7 [[phab:T426088|T426088]]
* 05:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1052: repool after upgrade
* 05:50 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 05:47 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:45 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1052.eqiad.wmnet with OS trixie
* 05:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:29 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1052.eqiad.wmnet with reason: host reimage
* 05:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:26 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:25 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1052.eqiad.wmnet with reason: host reimage
* 05:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 05:07 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1052.eqiad.wmnet with OS trixie
* 05:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1052: Upgrading es1052.eqiad.wmnet
* 05:06 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1052: Upgrading es1052.eqiad.wmnet
* 05:05 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 05:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 04:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 04:49 ryankemper: [[phab:T425007|T425007]] (k8s) created 4 wdqs namespaces on `dse-k8s-codfw`'s `admin_ng` ns: `wdqs-[internal,external]` & `wdqs-[internal,external]-next`; certs issued
* 04:46 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 04:40 ryankemper@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 04:36 ryankemper@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 04:05 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.2 (duration: 05m 33s)
== 2026-06-01 ==
* 23:27 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295963{{!}}Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542)]], [[gerrit:1295962{{!}}Carousel: Defer to MobileFrontend lightbox on mobile (T427679)]] (duration: 07m 17s)
* 23:23 jdlrobson@deploy1003: mfossati, jdlrobson: Continuing with deployment
* 23:22 jdlrobson@deploy1003: mfossati, jdlrobson: Backport for [[gerrit:1295963{{!}}Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542)]], [[gerrit:1295962{{!}}Carousel: Defer to MobileFrontend lightbox on mobile (T427679)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:20 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1295963{{!}}Make MultimediaViewer compatible with MobileFrontend legacy parser (T427542)]], [[gerrit:1295962{{!}}Carousel: Defer to MobileFrontend lightbox on mobile (T427679)]]
* 23:15 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296022{{!}}Donor Delight Badge: Add dependency on mw.user (T427850)]], [[gerrit:1296028{{!}}styles: Limit selector to badge client pref (T427407)]] (duration: 09m 33s)
* 23:11 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:07 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1296022{{!}}Donor Delight Badge: Add dependency on mw.user (T427850)]], [[gerrit:1296028{{!}}styles: Limit selector to badge client pref (T427407)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:06 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1296022{{!}}Donor Delight Badge: Add dependency on mw.user (T427850)]], [[gerrit:1296028{{!}}styles: Limit selector to badge client pref (T427407)]]
* 23:04 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6015.*
* 22:36 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296024{{!}}Add maintenance script to scrape SVG render files]] (duration: 06m 22s)
* 22:32 reedy@deploy1003: reedy: Continuing with deployment
* 22:31 reedy@deploy1003: reedy: Backport for [[gerrit:1296024{{!}}Add maintenance script to scrape SVG render files]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:30 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1296024{{!}}Add maintenance script to scrape SVG render files]]
* 22:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 22:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 22:00 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 21:58 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 21:56 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 21:51 sbassett: Deployed updated mitigation for [[phab:T326691|T326691]]
* 21:50 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 21:35 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 21:35 maryum: Deployed security fix for [[phab:T427611|T427611]]
* 21:35 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 21:33 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 21:32 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 21:27 maryum: Deployed security fix for [[phab:T427235|T427235]]
* 21:13 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1296002{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565)]], [[gerrit:1296003{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T427565)]], [[gerrit:1296009{{!}}Redirect Special:AccountRecovery to the shared domain (T427692)]] (duration: 09m 20s)
* 21:09 catrope@deploy1003: catrope, arlolra: Continuing with deployment
* 21:09 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 21:09 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 21:08 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 21:07 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 21:07 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 21:06 catrope@deploy1003: catrope, arlolra: Backport for [[gerrit:1296002{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565)]], [[gerrit:1296003{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T427565)]], [[gerrit:1296009{{!}}Redirect Special:AccountRecovery to the shared domain (T427692)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:04 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1296002{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T353697 T415591 T427565)]], [[gerrit:1296003{{!}}Bump wikimedia/parsoid to 0.24.0-a7 (T427565)]], [[gerrit:1296009{{!}}Redirect Special:AccountRecovery to the shared domain (T427692)]]
* 20:53 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 20:37 ryankemper@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wdqs1015.eqiad.wmnet with reason: [[phab:T427852|T427852]] hw failure
* 20:26 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285412{{!}}Remove `wgTestKitchenExperimentStreamNames` (T422358)]], [[gerrit:1295531{{!}}Enable AbuseFilter block action on nlwiki (T427384)]] (duration: 07m 48s)
* 20:22 catrope@deploy1003: sfaci, xxblackburnxx, catrope: Continuing with deployment
* 20:20 catrope@deploy1003: sfaci, xxblackburnxx, catrope: Backport for [[gerrit:1285412{{!}}Remove `wgTestKitchenExperimentStreamNames` (T422358)]], [[gerrit:1295531{{!}}Enable AbuseFilter block action on nlwiki (T427384)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:18 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1285412{{!}}Remove `wgTestKitchenExperimentStreamNames` (T422358)]], [[gerrit:1295531{{!}}Enable AbuseFilter block action on nlwiki (T427384)]]
* 20:12 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295504{{!}}passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)]] (duration: 07m 37s)
* 20:08 catrope@deploy1003: catrope: Continuing with deployment
* 20:07 catrope@deploy1003: catrope: Backport for [[gerrit:1295504{{!}}passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:05 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1295504{{!}}passwordlessLogin: Don't immediately error out in unsupported browsers (T427562)]]
* 19:48 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 19:47 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 19:47 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 19:46 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 19:46 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 19:45 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 19:01 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: sync
* 19:00 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: sync
* 18:24 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295950{{!}}mediawiki.user_change.dev0 - key by user.wiki_id (T426198)]] (duration: 06m 42s)
* 18:20 otto@deploy1003: otto: Continuing with deployment
* 18:19 otto@deploy1003: otto: Backport for [[gerrit:1295950{{!}}mediawiki.user_change.dev0 - key by user.wiki_id (T426198)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:17 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1295950{{!}}mediawiki.user_change.dev0 - key by user.wiki_id (T426198)]]
* 18:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 18:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 18:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd2001.codfw.wmnet to plain
* 18:02 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 18:02 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd2001.codfw.wmnet to plain
* 18:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2003.codfw.wmnet to plain
* 18:01 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 18:01 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2003.codfw.wmnet to plain
* 17:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 17:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 17:53 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-main2006.codfw.wmnet with OS trixie
* 17:42 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295976{{!}}nlwiki: change to Wikipedia 25 logo (T424519)]] (duration: 07m 29s)
* 17:37 samtar@deploy1003: chlod, samtar: Continuing with deployment
* 17:36 samtar@deploy1003: chlod, samtar: Backport for [[gerrit:1295976{{!}}nlwiki: change to Wikipedia 25 logo (T424519)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:34 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1295976{{!}}nlwiki: change to Wikipedia 25 logo (T424519)]]
* 17:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1236: Update
* 17:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd2001.codfw.wmnet to drbd
* 17:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
* 17:04 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 17:04 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1180: Pooling
* 17:03 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 17:03 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: Pooling
* 17:03 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1180: Pooling
* 16:59 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd2001.codfw.wmnet to drbd
* 16:58 Amir1: drop flaggedrevs tables on wikinews wikis ([[phab:T423577|T423577]])
* 16:57 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-main2006.codfw.wmnet with OS trixie
* 16:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93462 and previous config saved to /var/cache/conftool/dbconfig/20260601-165717-fceratto.json
* 16:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93460 and previous config saved to /var/cache/conftool/dbconfig/20260601-164709-fceratto.json
* 16:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
* 16:37 ryankemper@cumin2002: conftool action : set/pooled=no; selector: dc=eqiad,cluster=wdqs-main,service=wdqs-main,name=wdqs1015.eqiad.wmnet
* 16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93458 and previous config saved to /var/cache/conftool/dbconfig/20260601-163701-fceratto.json
* 16:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:35 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1236.eqiad.wmnet
* 16:35 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1236.eqiad.wmnet
* 16:35 ryankemper@cumin2002: conftool action : set/pooled=no; selector: dc=eqiad,cluster=wdqs,service=wdqs-main,name=wdqs1015.eqiad.wmnet
* 16:34 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
* 16:34 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1236: Update
* 16:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1236.eqiad.wmnet with reason: Kernel update [[phab:T426633|T426633]]
* 16:31 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:30 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1236.eqiad.wmnet
* 16:30 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1236.eqiad.wmnet
* 16:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
* 16:29 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1236: Update
* 16:29 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Update
* 16:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd2003.codfw.wmnet to drbd
* 16:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93455 and previous config saved to /var/cache/conftool/dbconfig/20260601-162653-fceratto.json
* 16:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 16:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Migration of db1209.eqiad.wmnet completed
* 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1236.eqiad.wmnet with reason: Kernel update [[phab:T426633|T426633]]
* 16:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1236: Update
* 16:09 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1236: Update
* 16:08 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 16:07 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 16:06 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 16:05 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd2003.codfw.wmnet to drbd
* 16:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 16:03 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 16:02 moritzm: temporarily remove ganeti2027 from the codfw cluster [[phab:T427357|T427357]]
* 15:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:56 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.depool (exit_code=97) depool db1224: Pooling
* 15:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2005.codfw.wmnet with OS bullseye
* 15:53 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1224: Pooling
* 15:51 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 15:49 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
* 15:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
* 15:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1224: Pooling
* 15:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2005.codfw.wmnet with reason: host reimage
* 15:40 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:40 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
* 15:40 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
* 15:40 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
* 15:40 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
* 15:40 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
* 15:39 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:39 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 15:39 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Migration of db1209.eqiad.wmnet completed
* 15:39 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 15:38 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:38 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
* 15:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2005.codfw.wmnet with reason: host reimage
* 15:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 15:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1209.eqiad.wmnet with OS trixie
* 15:28 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295802{{!}}hCaptcha: Raise SiteVerify error threshold to 100]] (duration: 06m 15s)
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93446 and previous config saved to /var/cache/conftool/dbconfig/20260601-152638-fceratto.json
* 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 15:26 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:25 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1224.eqiad.wmnet
* 15:25 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1224.eqiad.wmnet
* 15:25 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1224: Pooling
* 15:25 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1224: Pooling
* 15:24 kharlan@deploy1003: kharlan: Continuing with deployment
* 15:24 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295802{{!}}hCaptcha: Raise SiteVerify error threshold to 100]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:22 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host testvm2005.codfw.wmnet with OS bullseye
* 15:22 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295802{{!}}hCaptcha: Raise SiteVerify error threshold to 100]]
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:22 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:20 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295946{{!}}hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)]] (duration: 08m 24s)
* 15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:16 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
* 15:14 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1295946{{!}}hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1295946{{!}}hCaptcha: Enable for VisualEditor on all WMF wikis (T425940)]]
* 15:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1209.eqiad.wmnet with reason: host reimage
* 15:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93445 and previous config saved to /var/cache/conftool/dbconfig/20260601-151024-fceratto.json
* 15:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:sessionstore
* 15:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93443 and previous config saved to /var/cache/conftool/dbconfig/20260601-150017-fceratto.json
* 14:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1209.eqiad.wmnet with OS trixie
* 14:52 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 14:52 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1209: Upgrading db1209.eqiad.wmnet
* 14:52 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 14:52 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1209: Upgrading db1209.eqiad.wmnet
* 14:52 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 14:51 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:51 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 14:50 atsuko@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 14:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93441 and previous config saved to /var/cache/conftool/dbconfig/20260601-145010-fceratto.json
* 14:49 atsuko@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 14:49 atsuko@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 14:48 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:42 atsuko@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 14:41 atsuko@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93440 and previous config saved to /var/cache/conftool/dbconfig/20260601-144002-fceratto.json
* 14:37 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:30 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:30 ladsgroup@deploy1003: Synchronized portals: Deploy portals ([[phab:T421797|T421797]]) (duration: 02m 43s)
* 14:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:27 ladsgroup@deploy1003: Synchronized portals/wikipedia.org/assets: Deploy portals ([[phab:T421797|T421797]]) (duration: 06m 10s)
* 14:25 sukhe@dns1004: END - running authdns-update
* 14:23 sukhe@dns1004: START - running authdns-update
* 14:22 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:16 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:12 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:11 Lucas_WMDE: UTC afternoon backport+config window done
* 14:10 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295918{{!}}Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)]] (duration: 11m 06s)
* 14:06 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 14:05 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Continuing with deployment
* 14:03 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Backport for [[gerrit:1295918{{!}}Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:02 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 14:01 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:sessionstore
* 13:58 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1295918{{!}}Remove sfsblock-bypass from the IP block exemption user group on all wikis (T427745)]]
* 13:52 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 13:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1265.eqiad.wmnet with OS trixie
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93439 and previous config saved to /var/cache/conftool/dbconfig/20260601-133947-fceratto.json
* 13:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 13:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 13:35 atsukoito: restarted pybal.service on lvs2013
* 13:31 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 13:31 atsukoito: restarted pybal.service on lvs2014
* 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs-test2001.codfw.wmnet
* 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet
* 13:22 atsukoito: restarted pybal.service on lvs1019
* 13:22 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:21 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:20 atsukoito: restarted pybal.service on lvs1020
* 13:20 Msz2001: UTC afternoon backport+config window done
* 13:20 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295875{{!}}Add SetGlobalPreference maintenance script (T427476)]] (duration: 06m 22s)
* 13:19 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs-test2001.codfw.wmnet
* 13:18 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1265.eqiad.wmnet with OS trixie
* 13:18 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-wdqs-test1001.eqiad.wmnet
* 13:16 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 13:15 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1295875{{!}}Add SetGlobalPreference maintenance script (T427476)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 atsukoito: sudo cumin 'A:lvs-low-traffic-eqiad' 'systemctl restart pybal.service'
* 13:14 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1295875{{!}}Add SetGlobalPreference maintenance script (T427476)]]
* 13:12 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295536{{!}}swwiki: Enable the Visual Editor on the project namespace (T427117)]] (duration: 10m 06s)
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93438 and previous config saved to /var/cache/conftool/dbconfig/20260601-130949-fceratto.json
* 13:08 mszwarc@deploy1003: codenamenoreste, mszwarc: Continuing with deployment
* 13:07 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:06 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:05 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:04 mszwarc@deploy1003: codenamenoreste, mszwarc: Backport for [[gerrit:1295536{{!}}swwiki: Enable the Visual Editor on the project namespace (T427117)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:03 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:02 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1295536{{!}}swwiki: Enable the Visual Editor on the project namespace (T427117)]]
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93437 and previous config saved to /var/cache/conftool/dbconfig/20260601-125941-fceratto.json
* 12:56 dpogorzelski@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=inference,name=eqiad
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'readability' for release 'main' .
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 12:55 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 12:52 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:50 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:49 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P93436 and previous config saved to /var/cache/conftool/dbconfig/20260601-124934-fceratto.json
* 12:48 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:41 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93435 and previous config saved to /var/cache/conftool/dbconfig/20260601-123926-fceratto.json
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:29 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:28 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:27 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to plain
* 12:26 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to plain
* 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
* 12:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to drbd
* 12:20 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:17 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:15 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:15 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:11 dpogorzelski@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=inference,name=eqiad
* 12:07 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to drbd
* 12:05 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2027.codfw.wmnet
* 12:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:04 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2027.codfw.wmnet
* 12:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 11:59 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
* 11:59 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
* 11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93434 and previous config saved to /var/cache/conftool/dbconfig/20260601-113911-fceratto.json
* 11:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
* 11:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93433 and previous config saved to /var/cache/conftool/dbconfig/20260601-113843-fceratto.json
* 11:37 moritzm: installing Exim security updates
* 11:36 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:34 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:33 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:32 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:28 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93432 and previous config saved to /var/cache/conftool/dbconfig/20260601-112835-fceratto.json
* 11:25 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 11:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:23 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:22 moritzm: installing imagemagick security updates
* 11:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:22 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:22 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 11:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:21 trueg@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs: apply
* 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93430 and previous config saved to /var/cache/conftool/dbconfig/20260601-111827-fceratto.json
* 11:17 trueg@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs: apply
* 11:14 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 11:12 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 11:10 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93429 and previous config saved to /var/cache/conftool/dbconfig/20260601-110820-fceratto.json
* 11:04 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 11:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1055: repool after upgrade
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93427 and previous config saved to /var/cache/conftool/dbconfig/20260601-110121-fceratto.json
* 11:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 10:54 marostegui@dns1004: END - running authdns-update
* 10:52 marostegui@dns1004: START - running authdns-update
* 10:48 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es1050 to es1 eqiad primary [[phab:T427032|T427032]]', diff saved to https://phabricator.wikimedia.org/P93425 and previous config saved to /var/cache/conftool/dbconfig/20260601-104837-marostegui.json
* 10:47 marostegui@cumin1003: dbctl commit (dc=all): 'Promote es2055 to es1 codfw primary [[phab:T427032|T427032]]', diff saved to https://phabricator.wikimedia.org/P93424 and previous config saved to /var/cache/conftool/dbconfig/20260601-104739-marostegui.json
* 10:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1177: Migration of db1177.eqiad.wmnet completed
* 10:40 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2003.codfw.wmnet
* 10:34 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2003.codfw.wmnet
* 10:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93421 and previous config saved to /var/cache/conftool/dbconfig/20260601-103316-fceratto.json
* 10:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93418 and previous config saved to /var/cache/conftool/dbconfig/20260601-102308-fceratto.json
* 10:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1055: repool after upgrade
* 10:15 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1055.eqiad.wmnet with OS trixie
* 10:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P93415 and previous config saved to /var/cache/conftool/dbconfig/20260601-101300-fceratto.json
* 10:09 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:07 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93414 and previous config saved to /var/cache/conftool/dbconfig/20260601-100252-fceratto.json
* 10:00 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1177: Migration of db1177.eqiad.wmnet completed
* 09:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1055.eqiad.wmnet with reason: host reimage
* 09:56 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 09:54 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 09:53 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1055.eqiad.wmnet with reason: host reimage
* 09:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1177.eqiad.wmnet with OS trixie
* 09:51 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:50 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:39 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1055.eqiad.wmnet with OS trixie
* 09:38 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1055: Upgrading es1055.eqiad.wmnet
* 09:38 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1055: Upgrading es1055.eqiad.wmnet
* 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
* 09:31 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
* 09:17 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1177.eqiad.wmnet with OS trixie
* 09:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:14 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:13 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:12 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1177: Upgrading db1177.eqiad.wmnet
* 09:11 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1177: Upgrading db1177.eqiad.wmnet
* 09:11 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93410 and previous config saved to /var/cache/conftool/dbconfig/20260601-090237-fceratto.json
* 09:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93409 and previous config saved to /var/cache/conftool/dbconfig/20260601-090209-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P93408 and previous config saved to /var/cache/conftool/dbconfig/20260601-085202-fceratto.json
* 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P93407 and previous config saved to /var/cache/conftool/dbconfig/20260601-084154-fceratto.json
* 08:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93406 and previous config saved to /var/cache/conftool/dbconfig/20260601-083146-fceratto.json
* 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93405 and previous config saved to /var/cache/conftool/dbconfig/20260601-082442-fceratto.json
* 08:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
* 07:58 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295454{{!}}Disable the creation of synthetic main refs in production (T427484)]] (duration: 11m 26s)
* 07:56 XioNoX: add no_p2p term to pfw1-codfw BGP_fundraising_export - [[phab:T423384|T423384]]
* 07:52 wmde-fisch@deploy1003: lilients, wmde-fisch: Continuing with deployment
* 07:51 wmde-fisch@deploy1003: lilients, wmde-fisch: Backport for [[gerrit:1295454{{!}}Disable the creation of synthetic main refs in production (T427484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:47 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1295454{{!}}Disable the creation of synthetic main refs in production (T427484)]]
* 07:45 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294826{{!}}Update VE core submodule to master (9cf5524e7) (T424232)]] (duration: 31m 34s)
* 07:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 07:38 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 07:32 wmde-fisch@deploy1003: wmde-fisch: Continuing with deployment
* 07:31 wmde-fisch@deploy1003: wmde-fisch: Backport for [[gerrit:1294826{{!}}Update VE core submodule to master (9cf5524e7) (T424232)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 07:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1294826{{!}}Update VE core submodule to master (9cf5524e7) (T424232)]]
* 06:48 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 06:47 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
== 2026-05-31 ==
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 30s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-30 ==
* 16:21 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:38 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 27s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-29 ==
* 23:39 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 23:37 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 21:42 catrope@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 21:41 catrope@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:40 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] (duration: 06m 54s)
* 17:35 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 17:34 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:33 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]]
* 16:30 jgreen@dns1004: END - running authdns-update
* 16:28 jgreen@dns1004: START - running authdns-update
* 16:13 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:12 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:28 dancy@deploy1003: Installation of scap version "4.267.0" completed for 2 hosts
* 15:26 dancy@deploy1003: Installing scap version "4.267.0" for 2 host(s)
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:15 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] (duration: 07m 58s)
* 14:11 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:09 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]]
* 13:53 moritzm: imported OpenJDK 21 21.0.11+10-1~deb12u1 to component/jdk21 (backport of latest Java 21 security release for Bookworm)
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1006.wikimedia.org
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1006.wikimedia.org with OS trixie
* 11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1006.wikimedia.org with OS trixie
* 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:15 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:13 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:00 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:00 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1006.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1005.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1005.wikimedia.org with OS trixie
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:40 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Pooling
* 10:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1005.wikimedia.org with OS trixie
* 10:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:55 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 09:50 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 09:49 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:44 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2014.codfw.wmnet with OS bookworm
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:20 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:12 jynus@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:10 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:03 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad2002.codfw.wmnet
* 08:59 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad2002.codfw.wmnet
* 08:59 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad2002.codfw.wmnet - [[phab:T427588|T427588]]
* 08:54 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:51 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad1004.eqiad.wmnet
* 08:50 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
* 08:50 jynus@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host backup2014.codfw.wmnet with OS bookworm
* 08:49 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
* 08:47 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad1004.eqiad.wmnet
* 08:46 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad1004.eqiad.wmnet - [[phab:T427588|T427588]]
* 08:42 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:39 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:39 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:38 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 08:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 08:33 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:31 jynus@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup2014.codfw.wmnet with OS bookworm
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:18 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 08:17 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 08:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1005.wikimedia.org
* 08:05 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 07:54 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 07:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2212.codfw.wmnet
* 07:54 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2212.codfw.wmnet
* 07:22 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2006.wikimedia.org
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2006.wikimedia.org with OS trixie
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2006.wikimedia.org with OS trixie
* 06:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2006.wikimedia.org
* 03:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts db1224.eqiad.wmnet
* 02:56 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 01:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5032.eqsin.wmnet with OS trixie
* 01:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 01:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 00:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 00:29 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host cp5032.eqsin.wmnet
* 00:23 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:22 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
== 2026-05-28 ==
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:07 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:34 andrewbogott: reprepro includedeb trixie-wikimedia /home/andrew/magnum-cluster-api_0.36.6-1~wmf13u2_amd64.deb
* 22:31 logmsgbot: dreamyjazz Deployed security patch for [[phab:T426388|T426388]]
* 21:33 maryum: Deployed security fix for [[phab:T426867|T426867]]
* 21:21 alexsanford: Deployed security fix for [[phab:T426889|T426889]]
* 21:07 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host cp5032.eqsin.wmnet
* 21:04 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 21:04 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 20:48 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] (duration: 07m 34s)
* 20:44 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:43 arlolra@deploy1003: arlolra: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:41 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]]
* 20:34 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] (duration: 07m 20s)
* 20:30 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:29 arlolra@deploy1003: arlolra: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:27 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]]
* 20:22 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] (duration: 09m 07s)
* 20:18 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Continuing with deployment
* 20:14 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] synced to the testservers (see https://wikitech.
* 20:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5032.eqsin.wmnet with OS trixie
* 20:13 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]]
* 19:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1018.eqiad.wmnet
* 19:27 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1018.eqiad.wmnet
* 19:09 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: Kernel reboot
* 19:09 brett: Stopping pybal/puppet/downtiming lvs1018.eqiad.wmnet for reboot
* 19:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1019.eqiad.wmnet
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1019.eqiad.wmnet
* 18:52 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:51 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:47 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:40 mutante: planet1003/planet2003 - apt-get upgrade - all pending package upgrades
* 18:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: Kernel reboot
* 18:34 brett: Stopping pybal/puppet/downtiming lvs1019.eqiad.wmnet for reboot and BIOS update/memory self-healing - [[phab:T426109|T426109]]
* 18:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2011.codfw.wmnet
* 18:25 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2011.codfw.wmnet
* 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Kernel reboot
* 18:19 brett: Stopping pybal/puppet/downtiming lvs2011.codfw.wmnet for reboot
* 18:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 18:06 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 18:00 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: Kernel reboot
* 17:57 brett: Stopping pybal/puppet/downtiming lvs2013.codfw.wmnet for reboot
* 17:19 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 16:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93393 and previous config saved to /var/cache/conftool/dbconfig/20260528-164514-fceratto.json
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93392 and previous config saved to /var/cache/conftool/dbconfig/20260528-163507-fceratto.json
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93391 and previous config saved to /var/cache/conftool/dbconfig/20260528-162459-fceratto.json
* 16:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db1224.eqiad.wmnet with reason: unreachable [[phab:T427535|T427535]]
* 16:17 swfrench-wmf: reprepro include xdebug_3.4.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include wikidiff2_1.14.1-2+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include php-yaml_2.2.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-xhprof_2.3.10-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-wmerrors_2.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-uuid_1.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-redis_6.2.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-pcov_1.0.12-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-memcached_3.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:15 swfrench-wmf: reprepro include php-luasandbox_4.1.2-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93390 and previous config saved to /var/cache/conftool/dbconfig/20260528-161452-fceratto.json
* 16:14 swfrench-wmf: reprepro include php-imagick_3.7.0-13+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:14 swfrench-wmf: reprepro include php-excimer_1.2.5-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1251 ([[phab:T426633|T426633]])', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260528-160646-fceratto.json
* 16:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1251.eqiad.wmnet with reason: Maintenance
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93388 and previous config saved to /var/cache/conftool/dbconfig/20260528-160613-fceratto.json
* 15:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93387 and previous config saved to /var/cache/conftool/dbconfig/20260528-155605-fceratto.json
* 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93386 and previous config saved to /var/cache/conftool/dbconfig/20260528-154557-fceratto.json
* 15:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93385 and previous config saved to /var/cache/conftool/dbconfig/20260528-153550-fceratto.json
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93384 and previous config saved to /var/cache/conftool/dbconfig/20260528-152736-fceratto.json
* 15:27 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: Maintenance
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93383 and previous config saved to /var/cache/conftool/dbconfig/20260528-152708-fceratto.json
* 15:20 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5032.eqsin.wmnet with reason: Testing reimaging on new subnet
* 15:18 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 15:17 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93382 and previous config saved to /var/cache/conftool/dbconfig/20260528-151701-fceratto.json
* 15:17 jhathaway: dmarc ingress test on mx-in1001
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93381 and previous config saved to /var/cache/conftool/dbconfig/20260528-150653-fceratto.json
* 14:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93380 and previous config saved to /var/cache/conftool/dbconfig/20260528-145646-fceratto.json
* 14:56 moritzm: installing nginx security updates
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93379 and previous config saved to /var/cache/conftool/dbconfig/20260528-144936-fceratto.json
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: Maintenance
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93378 and previous config saved to /var/cache/conftool/dbconfig/20260528-144909-fceratto.json
* 14:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2005.wikimedia.org
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2005.wikimedia.org with OS trixie
* 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:39 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93377 and previous config saved to /var/cache/conftool/dbconfig/20260528-143901-fceratto.json
* 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93376 and previous config saved to /var/cache/conftool/dbconfig/20260528-142854-fceratto.json
* 14:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] (duration: 11m 29s)
* 14:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93375 and previous config saved to /var/cache/conftool/dbconfig/20260528-141846-fceratto.json
* 14:15 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93374 and previous config saved to /var/cache/conftool/dbconfig/20260528-141029-fceratto.json
* 14:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1232.eqiad.wmnet with reason: Maintenance
* 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2005.wikimedia.org with OS trixie
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93373 and previous config saved to /var/cache/conftool/dbconfig/20260528-141001-fceratto.json
* 14:09 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]]
* 14:00 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93371 and previous config saved to /var/cache/conftool/dbconfig/20260528-135951-fceratto.json
* 13:58 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6015.drmrs.wmnet,service=(cdn{{!}}ats-be)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93370 and previous config saved to /var/cache/conftool/dbconfig/20260528-134944-fceratto.json
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93369 and previous config saved to /var/cache/conftool/dbconfig/20260528-133936-fceratto.json
* 13:39 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:38 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:36 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] (duration: 06m 40s)
* 13:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93368 and previous config saved to /var/cache/conftool/dbconfig/20260528-133230-fceratto.json
* 13:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93367 and previous config saved to /var/cache/conftool/dbconfig/20260528-133202-fceratto.json
* 13:31 mlitn@deploy1003: mlitn: Continuing with deployment
* 13:31 mlitn@deploy1003: mlitn: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:29 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]]
* 13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93366 and previous config saved to /var/cache/conftool/dbconfig/20260528-132155-fceratto.json
* 13:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:17 elukey: clean up a lot of stale Kafka ACLs on Kafka Jumbo - Details in [[phab:T425528|T425528]]
* 13:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2005.wikimedia.org
* 13:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93365 and previous config saved to /var/cache/conftool/dbconfig/20260528-131147-fceratto.json
* 13:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93364 and previous config saved to /var/cache/conftool/dbconfig/20260528-130139-fceratto.json
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93363 and previous config saved to /var/cache/conftool/dbconfig/20260528-125439-fceratto.json
* 12:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93362 and previous config saved to /var/cache/conftool/dbconfig/20260528-125412-fceratto.json
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93361 and previous config saved to /var/cache/conftool/dbconfig/20260528-124404-fceratto.json
* 12:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:43 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:39 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:38 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93360 and previous config saved to /var/cache/conftool/dbconfig/20260528-123357-fceratto.json
* 12:25 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93359 and previous config saved to /var/cache/conftool/dbconfig/20260528-122349-fceratto.json
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93358 and previous config saved to /var/cache/conftool/dbconfig/20260528-121551-fceratto.json
* 12:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance
* 12:15 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93357 and previous config saved to /var/cache/conftool/dbconfig/20260528-121523-fceratto.json
* 12:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93356 and previous config saved to /var/cache/conftool/dbconfig/20260528-120515-fceratto.json
* 12:02 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 12:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93355 and previous config saved to /var/cache/conftool/dbconfig/20260528-115508-fceratto.json
* 11:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93354 and previous config saved to /var/cache/conftool/dbconfig/20260528-114500-fceratto.json
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93353 and previous config saved to /var/cache/conftool/dbconfig/20260528-113635-fceratto.json
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93352 and previous config saved to /var/cache/conftool/dbconfig/20260528-113559-fceratto.json
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93351 and previous config saved to /var/cache/conftool/dbconfig/20260528-112551-fceratto.json
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93350 and previous config saved to /var/cache/conftool/dbconfig/20260528-111543-fceratto.json
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93349 and previous config saved to /var/cache/conftool/dbconfig/20260528-110536-fceratto.json
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93348 and previous config saved to /var/cache/conftool/dbconfig/20260528-105820-fceratto.json
* 10:58 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1195.eqiad.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93347 and previous config saved to /var/cache/conftool/dbconfig/20260528-105753-fceratto.json
* 10:56 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 10:50 moritzm: update trixie netboot image for 13.5 point release [[phab:T427072|T427072]]
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93346 and previous config saved to /var/cache/conftool/dbconfig/20260528-104745-fceratto.json
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93345 and previous config saved to /var/cache/conftool/dbconfig/20260528-103738-fceratto.json
* 10:29 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P13724 # [[phab:T406971|T406971]]
* 10:28 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P14223 # [[phab:T422264|T422264]]
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93344 and previous config saved to /var/cache/conftool/dbconfig/20260528-102730-fceratto.json
* 10:26 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P1748 # [[phab:T422392|T422392]]
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93343 and previous config saved to /var/cache/conftool/dbconfig/20260528-101900-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93342 and previous config saved to /var/cache/conftool/dbconfig/20260528-101829-fceratto.json
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93341 and previous config saved to /var/cache/conftool/dbconfig/20260528-100822-fceratto.json
* 09:59 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] (duration: 06m 41s)
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93340 and previous config saved to /var/cache/conftool/dbconfig/20260528-095814-fceratto.json
* 09:55 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 09:54 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:52 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]]
* 09:48 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] (duration: 07m 37s)
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93339 and previous config saved to /var/cache/conftool/dbconfig/20260528-094807-fceratto.json
* 09:44 dreamyjazz@deploy1003: dreamyjazz, stran: Continuing with deployment
* 09:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:42 dreamyjazz@deploy1003: dreamyjazz, stran: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]]
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93338 and previous config saved to /var/cache/conftool/dbconfig/20260528-093920-fceratto.json
* 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93337 and previous config saved to /var/cache/conftool/dbconfig/20260528-093849-fceratto.json
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93336 and previous config saved to /var/cache/conftool/dbconfig/20260528-092842-fceratto.json
* 09:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
* 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93335 and previous config saved to /var/cache/conftool/dbconfig/20260528-092239-fceratto.json
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki-root1001.eqiad.wmnet
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93334 and previous config saved to /var/cache/conftool/dbconfig/20260528-091834-fceratto.json
* 09:18 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1165: Reboot completed
* 09:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:17 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 09:14 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:13 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki-root1001.eqiad.wmnet
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93332 and previous config saved to /var/cache/conftool/dbconfig/20260528-091231-fceratto.json
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93331 and previous config saved to /var/cache/conftool/dbconfig/20260528-090826-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93329 and previous config saved to /var/cache/conftool/dbconfig/20260528-090224-fceratto.json
* 09:02 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod (duration: 02m 31s)
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93328 and previous config saved to /var/cache/conftool/dbconfig/20260528-090114-fceratto.json
* 09:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2216.codfw.wmnet with reason: Maintenance
* 09:00 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a] (duration: 02m 08s)
* 08:59 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod
* 08:58 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a]
* 08:57 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host (duration: 00m 53s)
* 08:56 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host
* 08:56 joal@deploy1003: Finished deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a] (duration: 06m 54s)
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93327 and previous config saved to /var/cache/conftool/dbconfig/20260528-085216-fceratto.json
* 08:50 XioNoX: cr1-codfw# delete protocols bgp group fundraising family inet6 - [[phab:T423384|T423384]]
* 08:49 joal@deploy1003: Started deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a]
* 08:49 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] (duration: 09m 20s)
* 08:49 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a] (duration: 02m 00s)
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93326 and previous config saved to /var/cache/conftool/dbconfig/20260528-084906-fceratto.json
* 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
* 08:48 slyngshede@dns1004: END - running authdns-update
* 08:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1165: Reboot completed
* 08:47 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a]
* 08:47 slyngs: Upgrade IDP to CAS 7.3.7.1
* 08:46 slyngshede@dns1004: START - running authdns-update
* 08:45 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93324 and previous config saved to /var/cache/conftool/dbconfig/20260528-084149-fceratto.json
* 08:41 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]]
* 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 08:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93323 and previous config saved to /var/cache/conftool/dbconfig/20260528-083504-fceratto.json
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93322 and previous config saved to /var/cache/conftool/dbconfig/20260528-083331-fceratto.json
* 08:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Test
* 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93320 and previous config saved to /var/cache/conftool/dbconfig/20260528-082324-fceratto.json
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2189: repool after crash
* 08:17 slyngshede@dns1004: END - running authdns-update
* 08:16 slyngshede@dns1004: START - running authdns-update
* 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93318 and previous config saved to /var/cache/conftool/dbconfig/20260528-081316-fceratto.json
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:09 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Test
* 08:05 hashar@deploy1003: Finished deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016 (duration: 00m 13s)
* 08:05 hashar@deploy1003: Started deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93315 and previous config saved to /var/cache/conftool/dbconfig/20260528-080309-fceratto.json
* 07:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93314 and previous config saved to /var/cache/conftool/dbconfig/20260528-075631-fceratto.json
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020,1022-1023].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
* 07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93313 and previous config saved to /var/cache/conftool/dbconfig/20260528-075521-fceratto.json
* 07:47 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93311 and previous config saved to /var/cache/conftool/dbconfig/20260528-074513-fceratto.json
* 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2189: repool after crash
* 07:36 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93309 and previous config saved to /var/cache/conftool/dbconfig/20260528-073506-fceratto.json
* 07:34 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:29 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] (duration: 06m 29s)
* 07:25 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Continuing with deployment
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93308 and previous config saved to /var/cache/conftool/dbconfig/20260528-072458-fceratto.json
* 07:24 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:24 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:23 tgr@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwikisource --logwiki=metawiki Ioed Renamed_user_4232d41570b9e8f46ef150e5e360e446 # [[phab:T427459|T427459]]
* 07:22 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]]
* 07:20 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] (duration: 06m 54s)
* 07:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93307 and previous config saved to /var/cache/conftool/dbconfig/20260528-071836-fceratto.json
* 07:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 07:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Reboot completed
* 07:16 wmde-fisch@deploy1003: wmde-fisch, robertsky: Continuing with deployment
* 07:15 wmde-fisch@deploy1003: wmde-fisch, robertsky: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]]
* 07:11 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] (duration: 07m 15s)
* 07:07 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Continuing with deployment
* 07:06 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:04 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]]
* 06:43 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Reboot completed
* 06:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93303 and previous config saved to /var/cache/conftool/dbconfig/20260528-064217-fceratto.json
* 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93302 and previous config saved to /var/cache/conftool/dbconfig/20260528-063357-fceratto.json
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 06:25 hashar: Restarting CI Jenkins for plugins upgrades
* 06:16 fceratto@dns1005: END - running authdns-update
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1209 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93301 and previous config saved to /var/cache/conftool/dbconfig/20260528-061609-fceratto.json
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1193 to s8 primary and set section read-write [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93300 and previous config saved to /var/cache/conftool/dbconfig/20260528-061138-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s8 eqiad as read-only for maintenance - [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93299 and previous config saved to /var/cache/conftool/dbconfig/20260528-061048-fceratto.json
* 06:10 federico3: Starting s8 eqiad failover from db1209 to db1193 - [[phab:T426095|T426095]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1193 with weight 0 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93298 and previous config saved to /var/cache/conftool/dbconfig/20260528-060412-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 [[phab:T426095|T426095]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:53 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:49 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:25 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] (duration: 07m 12s)
* 00:21 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]]
* 00:12 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] (duration: 07m 25s)
* 00:09 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:06 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:04 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]]
* 00:04 swfrench-wmf: reprepro include php-apcu_5.1.24-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
== 2026-05-27 ==
* 23:13 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] (duration: 08m 42s)
* 23:09 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Continuing with deployment
* 23:06 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:04 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]]
* 22:58 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] (duration: 07m 49s)
* 22:55 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.sanitarium_restart (exit_code=0)
* 22:54 catrope@deploy1003: catrope: Continuing with deployment
* 22:52 catrope@deploy1003: catrope: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:50 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]]
* 22:46 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] (duration: 06m 54s)
* 22:42 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:41 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:40 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitarium_restart (exit_code=99)
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]]
* 22:39 ladsgroup@deploy1003: Finished scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) (duration: 07m 16s)
* 22:35 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:34 ladsgroup@deploy1003: ladsgroup: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:33 ladsgroup@deploy1003: Started scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]])
* 22:13 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] (duration: 10m 00s)
* 22:09 egardner@deploy1003: egardner: Continuing with deployment
* 22:05 egardner@deploy1003: egardner: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:03 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]]
* 21:37 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15 days, 0:00:00 on relforge[1008-1010].eqiad.wmnet with reason: non-production environment
* 21:20 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 21:19 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 21:04 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] (duration: 07m 38s)
* 20:59 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Continuing with deployment
* 20:58 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:56 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]]
* 20:51 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 30s)
* 20:47 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:46 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:44 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]]
* 20:43 swfrench-wmf: reprepro include dh-php_5.5+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:39 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts lvs1016.eqiad.wmnet
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:38 swfrench-wmf: reprepro include php-defaults_94+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:37 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:31 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:27 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf12u2 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:25 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs1016.eqiad.wmnet
* 20:25 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] (duration: 08m 11s)
* 20:21 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1016.eqiad.wmnet with OS bullseye
* 20:21 sbisson@deploy1003: sbisson: Continuing with deployment
* 20:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1020.eqiad.wmnet
* 20:19 sbisson@deploy1003: sbisson: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]]
* 20:14 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12355
* 20:04 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 12355
* 19:51 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1016.eqiad.wmnet with OS bullseye
* 19:48 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5032.eqsin.wmnet
* 19:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 19:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 19:01 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f] (duration: 02m 08s)
* 18:59 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f]
* 18:58 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 05m 01s)
* 18:53 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:53 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] (duration: 07m 41s)
* 18:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5031.eqsin.wmnet
* 18:49 catrope@deploy1003: catrope: Continuing with deployment
* 18:47 catrope@deploy1003: catrope: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:45 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]]
* 18:40 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 01m 05s)
* 18:39 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:37 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f] (duration: 02m 04s)
* 18:35 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f]
* 18:29 swfrench@deploy1003: Finished scap sync-world: Helmfile-only deployment to clean up unused mesh listeners (duration: 06m 12s)
* 18:25 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:24 swfrench@deploy1003: swfrench: Helmfile-only deployment to clean up unused mesh listeners synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:23 swfrench@deploy1003: Started scap sync-world: Helmfile-only deployment to clean up unused mesh listeners
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93296 and previous config saved to /var/cache/conftool/dbconfig/20260527-181923-fceratto.json
* 18:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93295 and previous config saved to /var/cache/conftool/dbconfig/20260527-180915-fceratto.json
* 18:09 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 18:09 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] (duration: 10m 24s)
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1017.eqiad.wmnet
* 18:08 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1017.eqiad.wmnet
* 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5024.eqsin.wmnet
* 18:03 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:00 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:00 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:00 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93294 and previous config saved to /var/cache/conftool/dbconfig/20260527-175908-fceratto.json
* 17:58 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]]
* 17:55 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93293 and previous config saved to /var/cache/conftool/dbconfig/20260527-174900-fceratto.json
* 17:43 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] (duration: 15m 01s)
* 17:38 swfrench@deploy1003: swfrench: Continuing with deployment
* 17:31 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:28 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]]
* 17:25 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1114.eqiad.wmnet
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:05 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] (duration: 08m 44s)
* 17:00 swfrench@deploy1003: swfrench: Continuing with deployment
* 16:58 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:56 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]]
* 16:53 atsuko@dns1004: END - running authdns-update
* 16:51 atsuko@dns1004: START - running authdns-update
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93292 and previous config saved to /var/cache/conftool/dbconfig/20260527-164846-fceratto.json
* 16:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93291 and previous config saved to /var/cache/conftool/dbconfig/20260527-164815-fceratto.json
* 16:43 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1112.eqiad.wmnet
* 16:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: Setting up
* 16:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93290 and previous config saved to /var/cache/conftool/dbconfig/20260527-163808-fceratto.json
* 16:37 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Repooling after testing patch
* 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93287 and previous config saved to /var/cache/conftool/dbconfig/20260527-162800-fceratto.json
* 16:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93285 and previous config saved to /var/cache/conftool/dbconfig/20260527-161753-fceratto.json
* 16:14 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 16:12 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 16:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93284 and previous config saved to /var/cache/conftool/dbconfig/20260527-161101-fceratto.json
* 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
* 16:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93283 and previous config saved to /var/cache/conftool/dbconfig/20260527-161034-fceratto.json
* 16:10 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 16:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1178: Recovering from failure in cookbook
* 16:10 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 16:05 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host durum5003.eqsin.wmnet with OS trixie
* 16:03 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6016.drmrs.wmnet
* 16:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93280 and previous config saved to /var/cache/conftool/dbconfig/20260527-160027-fceratto.json
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1017.eqiad.wmnet
* 15:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2163.codfw.wmnet
* 15:53 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2163.codfw.wmnet
* 15:52 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1017.eqiad.wmnet
* 15:52 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Repooling after testing patch
* 15:52 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 15:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Testing cookbook
* 15:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Testing cookbook
* 15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93276 and previous config saved to /var/cache/conftool/dbconfig/20260527-155019-fceratto.json
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93274 and previous config saved to /var/cache/conftool/dbconfig/20260527-154011-fceratto.json
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1178: Recovering from failure in cookbook
* 15:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1178.eqiad.wmnet
* 15:22 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1178.eqiad.wmnet
* 15:19 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 15:19 cdanis: 💙cdanis@cp4047.ulsfo.wmnet ~ 🕦☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:13 cdanis: 💙cdanis@cp5026.eqsin.wmnet ~ 🕚☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Icinga wait failed during run
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕚☕ sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93268 and previous config saved to /var/cache/conftool/dbconfig/20260527-150508-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1220.eqiad.wmnet with reason: Maintenance
* 15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93267 and previous config saved to /var/cache/conftool/dbconfig/20260527-150438-fceratto.json
* 14:59 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 14:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93264 and previous config saved to /var/cache/conftool/dbconfig/20260527-145430-fceratto.json
* 14:54 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2163.codfw.wmnet with OS trixie
* 14:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:46 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] (duration: 08m 32s)
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1178.eqiad.wmnet with OS trixie
* 14:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93263 and previous config saved to /var/cache/conftool/dbconfig/20260527-144423-fceratto.json
* 14:42 aude@deploy1003: aude: Continuing with deployment
* 14:40 aude@deploy1003: aude: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2189.codfw.wmnet with reason: crashed [[phab:T427376|T427376]]
* 14:38 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]]
* 14:35 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] (duration: 11m 30s)
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93262 and previous config saved to /var/cache/conftool/dbconfig/20260527-143416-fceratto.json
* 14:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 aude@deploy1003: aude: Continuing with deployment
* 14:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:27 aude@deploy1003: aude: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93260 and previous config saved to /var/cache/conftool/dbconfig/20260527-142659-fceratto.json
* 14:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:23 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]]
* 14:22 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1033.eqiad.wmnet with reason: Maintenance
* 14:18 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] (duration: 33m 01s)
* 14:10 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2163.codfw.wmnet with OS trixie
* 14:09 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS trixie
* 14:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1178: Upgrading db1178.eqiad.wmnet
* 14:07 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1178: Upgrading db1178.eqiad.wmnet
* 14:06 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 stran@deploy1003: stran: Continuing with deployment
* 14:02 stran@deploy1003: stran: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:56 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2164: Migration of db2164.codfw.wmnet completed
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:45 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]]
* 13:40 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] (duration: 11m 35s)
* 13:36 phuedx@deploy1003: phuedx: Continuing with deployment
* 13:30 phuedx@deploy1003: phuedx: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]]
* 13:21 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] (duration: 13m 23s)
* 13:15 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2189: Test
* 13:15 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 13:15 mlitn@deploy1003: krinkle, mlitn: Continuing with deployment
* 13:13 mlitn@deploy1003: krinkle, mlitn: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2164: Migration of db2164.codfw.wmnet completed
* 13:08 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]]
* 13:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2212.codfw.wmnet with reason: failed to reboot [[phab:T427388|T427388]] [[phab:T426633|T426633]]
* 13:05 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2164.codfw.wmnet with OS trixie
* 12:57 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1192.eqiad.wmnet with OS trixie
* 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:35 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:28 Amir1: deleting binlogs older than a year
* 12:22 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2164.codfw.wmnet with OS trixie
* 12:21 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 36692
* 12:21 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1192.eqiad.wmnet with OS trixie
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1077
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
* 12:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2164: Upgrading db2164.codfw.wmnet
* 12:20 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 36692
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1079
* 12:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2164: Upgrading db2164.codfw.wmnet
* 12:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1192: Upgrading db1192.eqiad.wmnet
* 12:19 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1192: Upgrading db1192.eqiad.wmnet
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:15 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Migration of db2165.codfw.wmnet completed
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:12 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2189: Test
* 12:11 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1193: Migration of db1193.eqiad.wmnet completed
* 12:09 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93243 and previous config saved to /var/cache/conftool/dbconfig/20260527-120452-fceratto.json
* 12:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93242 and previous config saved to /var/cache/conftool/dbconfig/20260527-120205-fceratto.json
* 12:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 11:58 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:58 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:56 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93239 and previous config saved to /var/cache/conftool/dbconfig/20260527-115157-fceratto.json
* 11:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93237 and previous config saved to /var/cache/conftool/dbconfig/20260527-114149-fceratto.json
* 11:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93235 and previous config saved to /var/cache/conftool/dbconfig/20260527-113142-fceratto.json
* 11:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Migration of db2165.codfw.wmnet completed
* 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1193: Migration of db1193.eqiad.wmnet completed
* 11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93231 and previous config saved to /var/cache/conftool/dbconfig/20260527-112327-fceratto.json
* 11:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2188.codfw.wmnet with reason: Maintenance
* 11:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93230 and previous config saved to /var/cache/conftool/dbconfig/20260527-112257-fceratto.json
* 11:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2165.codfw.wmnet with OS trixie
* 11:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1193.eqiad.wmnet with OS trixie
* 11:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93229 and previous config saved to /var/cache/conftool/dbconfig/20260527-111250-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:10 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93227 and previous config saved to /var/cache/conftool/dbconfig/20260527-110242-fceratto.json
* 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2189', diff saved to https://phabricator.wikimedia.org/P93226 and previous config saved to /var/cache/conftool/dbconfig/20260527-110016-marostegui.json
* 10:58 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:57 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 10:56 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93225 and previous config saved to /var/cache/conftool/dbconfig/20260527-105235-fceratto.json
* 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93223 and previous config saved to /var/cache/conftool/dbconfig/20260527-104518-fceratto.json
* 10:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93222 and previous config saved to /var/cache/conftool/dbconfig/20260527-104449-fceratto.json
* 10:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2165.codfw.wmnet with OS trixie
* 10:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1193.eqiad.wmnet with OS trixie
* 10:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2165: Upgrading db2165.codfw.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2165: Upgrading db2165.codfw.wmnet
* 10:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93218 and previous config saved to /var/cache/conftool/dbconfig/20260527-103441-fceratto.json
* 10:29 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:29 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93217 and previous config saved to /var/cache/conftool/dbconfig/20260527-102434-fceratto.json
* 10:22 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:21 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93215 and previous config saved to /var/cache/conftool/dbconfig/20260527-101426-fceratto.json
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1203: Migration of db1203.eqiad.wmnet completed
* 10:10 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2166: Migration of db2166.codfw.wmnet completed
* 10:08 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93212 and previous config saved to /var/cache/conftool/dbconfig/20260527-100701-fceratto.json
* 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 10:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93211 and previous config saved to /var/cache/conftool/dbconfig/20260527-100632-fceratto.json
* 10:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 10:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1050.eqiad.wmnet with OS trixie
* 09:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93208 and previous config saved to /var/cache/conftool/dbconfig/20260527-095624-fceratto.json
* 09:47 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93206 and previous config saved to /var/cache/conftool/dbconfig/20260527-094616-fceratto.json
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:43 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:38 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 09:38 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 09:37 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 09:37 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 09:36 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93203 and previous config saved to /var/cache/conftool/dbconfig/20260527-093609-fceratto.json
* 09:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93202 and previous config saved to /var/cache/conftool/dbconfig/20260527-092842-fceratto.json
* 09:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 09:28 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1203: Migration of db1203.eqiad.wmnet completed
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93200 and previous config saved to /var/cache/conftool/dbconfig/20260527-092814-fceratto.json
* 09:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1050.eqiad.wmnet with OS trixie
* 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 09:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2166: Migration of db2166.codfw.wmnet completed
* 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2051: repool after maintenance
* 09:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1203.eqiad.wmnet with OS trixie
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93196 and previous config saved to /var/cache/conftool/dbconfig/20260527-091806-fceratto.json
* 09:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2166.codfw.wmnet with OS trixie
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93194 and previous config saved to /var/cache/conftool/dbconfig/20260527-090759-fceratto.json
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3074.*
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3066.*
* 09:03 fabfur: repooling cp3074 and cp3066 ([[phab:T419825|T419825]])
* 09:02 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6015.drmrs.wmnet
* 09:02 slyngshede@cumin1003: START - Cookbook sre.hosts.remove-downtime for cp6015.drmrs.wmnet
* 09:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 09:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp6015.*
* 08:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93193 and previous config saved to /var/cache/conftool/dbconfig/20260527-085751-fceratto.json
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 08:54 Emperor: restart swift on ms-fe2011 [[phab:T360913|T360913]]
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3066.*
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3074.*
* 08:51 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:50 fabfur: depooling and installing haproxy-awslc on cp3074 and cp3066 ([[phab:T419825|T419825]])
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93191 and previous config saved to /var/cache/conftool/dbconfig/20260527-085024-fceratto.json
* 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93190 and previous config saved to /var/cache/conftool/dbconfig/20260527-085005-fceratto.json
* 08:41 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1203.eqiad.wmnet with OS trixie
* 08:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93189 and previous config saved to /var/cache/conftool/dbconfig/20260527-083957-fceratto.json
* 08:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2051: repool after maintenance
* 08:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 08:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:35 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2166.codfw.wmnet with OS trixie
* 08:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2051.codfw.wmnet with OS trixie
* 08:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93185 and previous config saved to /var/cache/conftool/dbconfig/20260527-082950-fceratto.json
* 08:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93184 and previous config saved to /var/cache/conftool/dbconfig/20260527-081942-fceratto.json
* 08:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:11 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93183 and previous config saved to /var/cache/conftool/dbconfig/20260527-081112-fceratto.json
* 08:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93182 and previous config saved to /var/cache/conftool/dbconfig/20260527-081054-fceratto.json
* 08:07 jmm@dns1004: END - running authdns-update
* 08:05 jmm@dns1004: START - running authdns-update
* 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93181 and previous config saved to /var/cache/conftool/dbconfig/20260527-080046-fceratto.json
* 07:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2051.codfw.wmnet with OS trixie
* 07:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93180 and previous config saved to /var/cache/conftool/dbconfig/20260527-075039-fceratto.json
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1026.eqiad.wmnet
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2051: Upgrading es2051.codfw.wmnet
* 07:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2051: Upgrading es2051.codfw.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93178 and previous config saved to /var/cache/conftool/dbconfig/20260527-074031-fceratto.json
* 07:40 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] (duration: 06m 42s)
* 07:36 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 07:35 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93177 and previous config saved to /var/cache/conftool/dbconfig/20260527-073504-fceratto.json
* 07:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2248.codfw.wmnet with reason: Maintenance
* 07:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93176 and previous config saved to /var/cache/conftool/dbconfig/20260527-073434-fceratto.json
* 07:33 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]]
* 07:28 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93175 and previous config saved to /var/cache/conftool/dbconfig/20260527-072426-fceratto.json
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.decommission (exit_code=0)
* 07:23 marostegui@cumin1003: Removing pc1014 from zarcillo [[phab:T427190|T427190]]
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1014.eqiad.wmnet
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:23 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:18 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1026.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1025.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93174 and previous config saved to /var/cache/conftool/dbconfig/20260527-071418-fceratto.json
* 07:13 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1014.eqiad.wmnet
* 07:13 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2003.wikimedia.org
* 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2055: repool after maintenance
* 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2003.wikimedia.org
* 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1003.wikimedia.org
* 07:06 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: Maintenance on db1190
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93172 and previous config saved to /var/cache/conftool/dbconfig/20260527-070410-fceratto.json
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1003.wikimedia.org
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93171 and previous config saved to /var/cache/conftool/dbconfig/20260527-065545-fceratto.json
* 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93170 and previous config saved to /var/cache/conftool/dbconfig/20260527-065526-fceratto.json
* 06:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1025.eqiad.wmnet
* 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93168 and previous config saved to /var/cache/conftool/dbconfig/20260527-064519-fceratto.json
* 06:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93166 and previous config saved to /var/cache/conftool/dbconfig/20260527-063511-fceratto.json
* 06:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93165 and previous config saved to /var/cache/conftool/dbconfig/20260527-062503-fceratto.json
* 06:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2055: repool after maintenance
* 06:21 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2055.codfw.wmnet with OS trixie
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93163 and previous config saved to /var/cache/conftool/dbconfig/20260527-061643-fceratto.json
* 06:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93162 and previous config saved to /var/cache/conftool/dbconfig/20260527-061613-fceratto.json
* 06:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93161 and previous config saved to /var/cache/conftool/dbconfig/20260527-060606-fceratto.json
* 06:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:56 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93160 and previous config saved to /var/cache/conftool/dbconfig/20260527-055558-fceratto.json
* 05:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93159 and previous config saved to /var/cache/conftool/dbconfig/20260527-054550-fceratto.json
* 05:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2055.codfw.wmnet with OS trixie
* 05:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:38 moritzm: remove ganeti1026 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93157 and previous config saved to /var/cache/conftool/dbconfig/20260527-053727-fceratto.json
* 05:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93156 and previous config saved to /var/cache/conftool/dbconfig/20260527-053708-fceratto.json
* 05:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93155 and previous config saved to /var/cache/conftool/dbconfig/20260527-052700-fceratto.json
* 05:26 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1014 from dbctl [[phab:T427270|T427270]]', diff saved to https://phabricator.wikimedia.org/P93154 and previous config saved to /var/cache/conftool/dbconfig/20260527-052624-marostegui.json
* 05:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93153 and previous config saved to /var/cache/conftool/dbconfig/20260527-051653-fceratto.json
* 05:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93152 and previous config saved to /var/cache/conftool/dbconfig/20260527-050645-fceratto.json
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93151 and previous config saved to /var/cache/conftool/dbconfig/20260527-045827-fceratto.json
* 04:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93150 and previous config saved to /var/cache/conftool/dbconfig/20260527-045759-fceratto.json
* 04:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93149 and previous config saved to /var/cache/conftool/dbconfig/20260527-044751-fceratto.json
* 04:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93148 and previous config saved to /var/cache/conftool/dbconfig/20260527-043744-fceratto.json
* 04:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93147 and previous config saved to /var/cache/conftool/dbconfig/20260527-042737-fceratto.json
* 04:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93146 and previous config saved to /var/cache/conftool/dbconfig/20260527-041921-fceratto.json
* 04:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 04:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93145 and previous config saved to /var/cache/conftool/dbconfig/20260527-041852-fceratto.json
* 04:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93144 and previous config saved to /var/cache/conftool/dbconfig/20260527-040844-fceratto.json
* 03:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93143 and previous config saved to /var/cache/conftool/dbconfig/20260527-035836-fceratto.json
* 03:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93142 and previous config saved to /var/cache/conftool/dbconfig/20260527-034828-fceratto.json
* 03:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93141 and previous config saved to /var/cache/conftool/dbconfig/20260527-034008-fceratto.json
* 03:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 03:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93140 and previous config saved to /var/cache/conftool/dbconfig/20260527-033938-fceratto.json
* 03:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93139 and previous config saved to /var/cache/conftool/dbconfig/20260527-032931-fceratto.json
* 03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93138 and previous config saved to /var/cache/conftool/dbconfig/20260527-031923-fceratto.json
* 03:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93137 and previous config saved to /var/cache/conftool/dbconfig/20260527-030915-fceratto.json
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93136 and previous config saved to /var/cache/conftool/dbconfig/20260527-030045-fceratto.json
* 03:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: Maintenance
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93135 and previous config saved to /var/cache/conftool/dbconfig/20260527-030016-fceratto.json
* 02:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93134 and previous config saved to /var/cache/conftool/dbconfig/20260527-025008-fceratto.json
* 02:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93133 and previous config saved to /var/cache/conftool/dbconfig/20260527-024000-fceratto.json
* 02:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93132 and previous config saved to /var/cache/conftool/dbconfig/20260527-022953-fceratto.json
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93131 and previous config saved to /var/cache/conftool/dbconfig/20260527-022133-fceratto.json
* 02:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93130 and previous config saved to /var/cache/conftool/dbconfig/20260527-022100-fceratto.json
* 02:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93129 and previous config saved to /var/cache/conftool/dbconfig/20260527-021053-fceratto.json
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93128 and previous config saved to /var/cache/conftool/dbconfig/20260527-020045-fceratto.json
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93127 and previous config saved to /var/cache/conftool/dbconfig/20260527-015037-fceratto.json
* 01:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93126 and previous config saved to /var/cache/conftool/dbconfig/20260527-014204-fceratto.json
* 01:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 01:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93125 and previous config saved to /var/cache/conftool/dbconfig/20260527-014134-fceratto.json
* 01:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93124 and previous config saved to /var/cache/conftool/dbconfig/20260527-013126-fceratto.json
* 01:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93123 and previous config saved to /var/cache/conftool/dbconfig/20260527-012119-fceratto.json
* 01:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93122 and previous config saved to /var/cache/conftool/dbconfig/20260527-011111-fceratto.json
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93121 and previous config saved to /var/cache/conftool/dbconfig/20260527-010234-fceratto.json
* 01:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93120 and previous config saved to /var/cache/conftool/dbconfig/20260527-010205-fceratto.json
* 00:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93119 and previous config saved to /var/cache/conftool/dbconfig/20260527-005157-fceratto.json
* 00:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93118 and previous config saved to /var/cache/conftool/dbconfig/20260527-004149-fceratto.json
* 00:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93117 and previous config saved to /var/cache/conftool/dbconfig/20260527-003141-fceratto.json
* 00:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93116 and previous config saved to /var/cache/conftool/dbconfig/20260527-002309-fceratto.json
* 00:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 00:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93115 and previous config saved to /var/cache/conftool/dbconfig/20260527-002228-fceratto.json
* 00:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93114 and previous config saved to /var/cache/conftool/dbconfig/20260527-001220-fceratto.json
* 00:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93113 and previous config saved to /var/cache/conftool/dbconfig/20260527-000209-fceratto.json
== 2026-05-26 ==
* 23:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93112 and previous config saved to /var/cache/conftool/dbconfig/20260526-235201-fceratto.json
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93111 and previous config saved to /var/cache/conftool/dbconfig/20260526-234451-fceratto.json
* 23:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93110 and previous config saved to /var/cache/conftool/dbconfig/20260526-234421-fceratto.json
* 23:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93109 and previous config saved to /var/cache/conftool/dbconfig/20260526-233414-fceratto.json
* 23:27 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 23:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93108 and previous config saved to /var/cache/conftool/dbconfig/20260526-232406-fceratto.json
* 23:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93107 and previous config saved to /var/cache/conftool/dbconfig/20260526-231358-fceratto.json
* 23:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93106 and previous config saved to /var/cache/conftool/dbconfig/20260526-230650-fceratto.json
* 23:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93105 and previous config saved to /var/cache/conftool/dbconfig/20260526-230620-fceratto.json
* 22:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93104 and previous config saved to /var/cache/conftool/dbconfig/20260526-225612-fceratto.json
* 22:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93103 and previous config saved to /var/cache/conftool/dbconfig/20260526-224604-fceratto.json
* 22:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93101 and previous config saved to /var/cache/conftool/dbconfig/20260526-223556-fceratto.json
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93100 and previous config saved to /var/cache/conftool/dbconfig/20260526-222848-fceratto.json
* 22:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93099 and previous config saved to /var/cache/conftool/dbconfig/20260526-222828-fceratto.json
* 22:23 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp6015.drmrs.wmnet
* 22:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93098 and previous config saved to /var/cache/conftool/dbconfig/20260526-221819-fceratto.json
* 22:10 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS trixie
* 22:08 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS trixie
* 22:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93097 and previous config saved to /var/cache/conftool/dbconfig/20260526-220811-fceratto.json
* 22:04 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] (duration: 09m 30s)
* 22:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 22:00 egardner@deploy1003: egardner, mfossati: Continuing with deployment
* 21:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93096 and previous config saved to /var/cache/conftool/dbconfig/20260526-215803-fceratto.json
* 21:57 egardner@deploy1003: egardner, mfossati: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:56 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:56 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1010.eqiad.wmnet with OS trixie
* 21:56 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:55 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]]
* 21:54 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 21:51 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93095 and previous config saved to /var/cache/conftool/dbconfig/20260526-215043-fceratto.json
* 21:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93094 and previous config saved to /var/cache/conftool/dbconfig/20260526-215011-fceratto.json
* 21:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:47 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1009
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1009
* 21:43 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1009
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:42 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1008
* 21:40 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1008
* 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93093 and previous config saved to /var/cache/conftool/dbconfig/20260526-214003-fceratto.json
* 21:36 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1008
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:36 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:35 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1010
* 21:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1010
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1010.eqiad.wmnet with OS trixie
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1009
* 21:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS trixie
* 21:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93092 and previous config saved to /var/cache/conftool/dbconfig/20260526-212955-fceratto.json
* 21:29 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1008
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS trixie
* 21:27 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "all.dblist - mediamoderation-continuous-scan.dblist - preinstall.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose` in tmux session - [[phab:T421688|T421688]]
* 21:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93091 and previous config saved to /var/cache/conftool/dbconfig/20260526-211948-fceratto.json
* 21:19 jhathaway: DMARC ingress test run on mx-in1001
* 21:15 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw and A:cp
* 21:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2057.codfw.wmnet
* 21:14 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_codfw and A:cp
* 21:14 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2058.codfw.wmnet
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93090 and previous config saved to /var/cache/conftool/dbconfig/20260526-211238-fceratto.json
* 21:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2222.codfw.wmnet with reason: Maintenance
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93089 and previous config saved to /var/cache/conftool/dbconfig/20260526-211207-fceratto.json
* 21:06 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 21:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93088 and previous config saved to /var/cache/conftool/dbconfig/20260526-210159-fceratto.json
* 20:55 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2003.codfw.wmnet with reason: WIP
* 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93087 and previous config saved to /var/cache/conftool/dbconfig/20260526-205152-fceratto.json
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:50 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:45 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93086 and previous config saved to /var/cache/conftool/dbconfig/20260526-204143-fceratto.json
* 20:38 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2055.codfw.wmnet
* 20:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93085 and previous config saved to /var/cache/conftool/dbconfig/20260526-203430-fceratto.json
* 20:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2221.codfw.wmnet with reason: Maintenance
* 20:34 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2056.codfw.wmnet
* 20:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93084 and previous config saved to /var/cache/conftool/dbconfig/20260526-203357-fceratto.json
* 20:32 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 20:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 20:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93083 and previous config saved to /var/cache/conftool/dbconfig/20260526-202349-fceratto.json
* 20:18 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] (duration: 09m 14s)
* 20:14 alexsanford@deploy1003: alexsanford, aude: Continuing with deployment
* 20:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93082 and previous config saved to /var/cache/conftool/dbconfig/20260526-201341-fceratto.json
* 20:11 alexsanford@deploy1003: alexsanford, aude: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]]
* 20:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93081 and previous config saved to /var/cache/conftool/dbconfig/20260526-200333-fceratto.json
* 19:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2053.codfw.wmnet
* 19:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2029.codfw.wmnet with OS trixie
* 19:57 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2028.codfw.wmnet with OS trixie
* 19:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93080 and previous config saved to /var/cache/conftool/dbconfig/20260526-195632-fceratto.json
* 19:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
* 19:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93079 and previous config saved to /var/cache/conftool/dbconfig/20260526-195557-fceratto.json
* 19:55 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2054.codfw.wmnet
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93078 and previous config saved to /var/cache/conftool/dbconfig/20260526-194549-fceratto.json
* 19:45 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2014.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2013.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:38 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93077 and previous config saved to /var/cache/conftool/dbconfig/20260526-193541-fceratto.json
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:30 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93076 and previous config saved to /var/cache/conftool/dbconfig/20260526-192533-fceratto.json
* 19:24 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:21 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 19:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2051.codfw.wmnet
* 19:19 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93075 and previous config saved to /var/cache/conftool/dbconfig/20260526-191818-fceratto.json
* 19:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 19:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93074 and previous config saved to /var/cache/conftool/dbconfig/20260526-191748-fceratto.json
* 19:16 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2052.codfw.wmnet
* 19:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93073 and previous config saved to /var/cache/conftool/dbconfig/20260526-190740-fceratto.json
* 19:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 19:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93072 and previous config saved to /var/cache/conftool/dbconfig/20260526-185732-fceratto.json
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93071 and previous config saved to /var/cache/conftool/dbconfig/20260526-184724-fceratto.json
* 18:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
* 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
* 18:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb2014.codfw.wmnet with OS trixie
* 18:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2049.codfw.wmnet
* 18:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93070 and previous config saved to /var/cache/conftool/dbconfig/20260526-184009-fceratto.json
* 18:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93069 and previous config saved to /var/cache/conftool/dbconfig/20260526-183939-fceratto.json
* 18:37 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2050.codfw.wmnet
* 18:30 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93068 and previous config saved to /var/cache/conftool/dbconfig/20260526-182931-fceratto.json
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:29 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:24 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 18:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93066 and previous config saved to /var/cache/conftool/dbconfig/20260526-181923-fceratto.json
* 18:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93065 and previous config saved to /var/cache/conftool/dbconfig/20260526-180915-fceratto.json
* 18:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93064 and previous config saved to /var/cache/conftool/dbconfig/20260526-180205-fceratto.json
* 18:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 18:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93063 and previous config saved to /var/cache/conftool/dbconfig/20260526-180132-fceratto.json
* 18:00 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2047.codfw.wmnet
* 17:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2048.codfw.wmnet
* 17:54 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93062 and previous config saved to /var/cache/conftool/dbconfig/20260526-175124-fceratto.json
* 17:42 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] (duration: 07m 25s)
* 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93060 and previous config saved to /var/cache/conftool/dbconfig/20260526-174117-fceratto.json
* 17:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ms-be2089.codfw.wmnet
* 17:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]]
* 17:33 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93059 and previous config saved to /var/cache/conftool/dbconfig/20260526-173109-fceratto.json
* 17:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:26 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:25 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93058 and previous config saved to /var/cache/conftool/dbconfig/20260526-172332-fceratto.json
* 17:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93057 and previous config saved to /var/cache/conftool/dbconfig/20260526-172303-fceratto.json
* 17:21 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet
* 17:20 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 17:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet
* 17:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:17 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 17:14 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:13 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:13 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93056 and previous config saved to /var/cache/conftool/dbconfig/20260526-171255-fceratto.json
* 17:11 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93055 and previous config saved to /var/cache/conftool/dbconfig/20260526-170247-fceratto.json
* 17:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:57 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93054 and previous config saved to /var/cache/conftool/dbconfig/20260526-165240-fceratto.json
* 16:50 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93053 and previous config saved to /var/cache/conftool/dbconfig/20260526-164421-fceratto.json
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
* 16:44 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93052 and previous config saved to /var/cache/conftool/dbconfig/20260526-164352-fceratto.json
* 16:42 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2043.codfw.wmnet
* 16:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2044.codfw.wmnet
* 16:40 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:40 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:40 brett: reboot lvs101[345].eqiad.wmnet
* 16:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:36 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:35 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_codfw and A:cp
* 16:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93051 and previous config saved to /var/cache/conftool/dbconfig/20260526-163344-fceratto.json
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_codfw and A:cp
* 16:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:31 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93050 and previous config saved to /var/cache/conftool/dbconfig/20260526-162336-fceratto.json
* 16:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93049 and previous config saved to /var/cache/conftool/dbconfig/20260526-161328-fceratto.json
* 16:11 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:11 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=eqiad
* 16:06 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93047 and previous config saved to /var/cache/conftool/dbconfig/20260526-160450-fceratto.json
* 16:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93046 and previous config saved to /var/cache/conftool/dbconfig/20260526-160420-fceratto.json
* 16:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:03 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 28s)
* 16:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 16:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 22s)
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93045 and previous config saved to /var/cache/conftool/dbconfig/20260526-155413-fceratto.json
* 15:46 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=eqiad
* 15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93044 and previous config saved to /var/cache/conftool/dbconfig/20260526-154405-fceratto.json
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93043 and previous config saved to /var/cache/conftool/dbconfig/20260526-153357-fceratto.json
* 15:30 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93042 and previous config saved to /var/cache/conftool/dbconfig/20260526-152629-fceratto.json
* 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93041 and previous config saved to /var/cache/conftool/dbconfig/20260526-152559-fceratto.json
* 15:24 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93040 and previous config saved to /var/cache/conftool/dbconfig/20260526-151552-fceratto.json
* 15:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Rack maintenance completed
* 15:10 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2196.codfw.wmnet
* 15:10 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2196.codfw.wmnet
* 15:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=codfw
* 15:06 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: Rack maintenance completed
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93037 and previous config saved to /var/cache/conftool/dbconfig/20260526-150546-fceratto.json
* 15:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: Rack maintenance completed
* 15:04 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]] (duration: 00m 39s)
* 15:03 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]]
* 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]] (duration: 00m 45s)
* 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]]
* 15:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 15:01 bjensen: uploading prometheus-memcached-exporter_0.16.0-1_amd64 on apt1002
* 15:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 15:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2223: switch maintenance
* 14:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Rack maintenance completed
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93033 and previous config saved to /var/cache/conftool/dbconfig/20260526-145538-fceratto.json
* 14:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 moritzm: remove ganeti1025 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2222: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2221: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:49 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93030 and previous config saved to /var/cache/conftool/dbconfig/20260526-144718-fceratto.json
* 14:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 14:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93029 and previous config saved to /var/cache/conftool/dbconfig/20260526-144651-fceratto.json
* 14:45 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:45 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:43 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=codfw
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2167: Migration of db2167.codfw.wmnet completed
* 14:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93026 and previous config saved to /var/cache/conftool/dbconfig/20260526-143643-fceratto.json
* 14:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1054.eqiad.wmnet with OS trixie
* 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93023 and previous config saved to /var/cache/conftool/dbconfig/20260526-142636-fceratto.json
* 14:26 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1014: Rack maintenance completed
* 14:24 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1014: Rack maintenance completed
* 14:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 14:19 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:19 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:18 jynus: restarting mediabackups@codfw after maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93021 and previous config saved to /var/cache/conftool/dbconfig/20260526-141628-fceratto.json
* 14:16 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:14 fabfur: repooled cp2043 ([[phab:T426199|T426199]])
* 14:14 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2223: switch maintenance
* 14:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:14 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 14:13 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] (duration: 06m 40s)
* 14:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:10 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:10 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2011.codfw.wmnet
* 14:10 fabfur@cumin1003: START - Cookbook sre.hosts.remove-downtime for lvs2011.codfw.wmnet
* 14:09 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:09 fabfur: restoring lvs2011 as primary ([[phab:T426199|T426199]])
* 14:08 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:08 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93017 and previous config saved to /var/cache/conftool/dbconfig/20260526-140748-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93016 and previous config saved to /var/cache/conftool/dbconfig/20260526-140718-fceratto.json
* 14:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]]
* 14:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
* 14:05 marostegui@cumin1003: Removing pc1013 from zarcillo [[phab:T427190|T427190]]
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1013.eqiad.wmnet
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:04 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:00 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 13:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93014 and previous config saved to /var/cache/conftool/dbconfig/20260526-135711-fceratto.json
* 13:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1054.eqiad.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2167: Migration of db2167.codfw.wmnet completed
* 13:53 Amir1: drop flaggedrevs tables on cawikinews ([[phab:T423577|T423577]])
* 13:49 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1013.eqiad.wmnet
* 13:49 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 13:48 Lucas_WMDE: UTC afternoon backport+config window done
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93012 and previous config saved to /var/cache/conftool/dbconfig/20260526-134703-fceratto.json
* 13:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2167.codfw.wmnet with OS trixie
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93011 and previous config saved to /var/cache/conftool/dbconfig/20260526-133656-fceratto.json
* 13:36 XioNoX: reboot lsw1-a2-codfw for software upgrade - [[phab:T426199|T426199]]
* 13:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: switch maintenance
* 13:35 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] (duration: 09m 28s)
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2221: switch maintenance
* 13:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: switch maintenance
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2196: switch maintenance
* 13:31 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:30 stran@deploy1003: stran: Continuing with deployment
* 13:29 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93006 and previous config saved to /var/cache/conftool/dbconfig/20260526-132927-fceratto.json
* 13:29 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: Maintenance
* 13:29 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 34 hosts with reason: Switch maintenance
* 13:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93005 and previous config saved to /var/cache/conftool/dbconfig/20260526-132857-fceratto.json
* 13:28 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lsw1-a2-codfw,lsw1-a2-codfw IPv6,lsw1-a2-codfw.mgmt with reason: Switch maintenance
* 13:27 stran@deploy1003: stran: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]]
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] (duration: 08m 30s)
* 13:22 ladsgroup@dns1004: END - running authdns-update
* 13:20 ladsgroup@dns1004: START - running authdns-update
* 13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93004 and previous config saved to /var/cache/conftool/dbconfig/20260526-131850-fceratto.json
* 13:18 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Continuing with deployment
* 13:16 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]]
* 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] (duration: 07m 09s)
* 13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93003 and previous config saved to /var/cache/conftool/dbconfig/20260526-130842-fceratto.json
* 13:08 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:07 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2167.codfw.wmnet with OS trixie
* 13:07 sbisson@deploy1003: sbisson: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2167: Upgrading db2167.codfw.wmnet
* 13:05 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]]
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2167: Upgrading db2167.codfw.wmnet
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:04 kart_: Update Recommendation API to 2026-05-26-074931-production
* 13:03 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:00 topranks: deactivate CR BGP to doh2002 to test backup path via doh2001
* 12:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93000 and previous config saved to /var/cache/conftool/dbconfig/20260526-125834-fceratto.json
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92999 and previous config saved to /var/cache/conftool/dbconfig/20260526-125135-fceratto.json
* 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2226.codfw.wmnet with reason: Maintenance
* 12:51 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92998 and previous config saved to /var/cache/conftool/dbconfig/20260526-125105-fceratto.json
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92997 and previous config saved to /var/cache/conftool/dbconfig/20260526-124059-fceratto.json
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2003.wikimedia.org
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1214: Migration of db1214.eqiad.wmnet completed
* 12:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc2003.wikimedia.org
* 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92995 and previous config saved to /var/cache/conftool/dbconfig/20260526-123052-fceratto.json
* 12:26 fabfur: depooled cp2043 for network activity ([[phab:T426199|T426199]])
* 12:26 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 12:24 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-a1-codfw,ssw1-a1-codfw IPv6,ssw1-a1-codfw.mgmt with reason: Switch maintenance
* 12:24 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mirror1001.wikimedia.org
* 12:23 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 12:23 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 12:22 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92993 and previous config saved to /var/cache/conftool/dbconfig/20260526-122044-fceratto.json
* 12:20 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 12:19 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mirror1001.wikimedia.org
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92991 and previous config saved to /var/cache/conftool/dbconfig/20260526-121336-fceratto.json
* 12:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: Maintenance
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92990 and previous config saved to /var/cache/conftool/dbconfig/20260526-121306-fceratto.json
* 12:09 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Planned downtime for rack maintenance
* 12:08 fabfur: downtime, disable puppet and stop pybal for rack maintenance ([[phab:T426199|T426199]])
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2181: Migration of db2181.codfw.wmnet completed
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92987 and previous config saved to /var/cache/conftool/dbconfig/20260526-120258-fceratto.json
* 12:01 XioNoX: start ssw1-a1-codfw network maintenance (no impact expected as the spines are redundant)
* 11:59 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] (duration: 15m 26s)
* 11:56 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup2015.codfw.wmnet,db2197.codfw.wmnet with reason: network maintenance
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1005.eqiad.wmnet
* 11:55 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
* 11:54 jynus: stopping mediabackups@codfw for maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 11:54 jmm@dns1004: END - running authdns-update
* 11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92985 and previous config saved to /var/cache/conftool/dbconfig/20260526-115251-fceratto.json
* 11:52 jmm@dns1004: START - running authdns-update
* 11:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1005.eqiad.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1214: Migration of db1214.eqiad.wmnet completed
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1004.eqiad.wmnet
* 11:47 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:46 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1004.eqiad.wmnet
* 11:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]]
* 11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92983 and previous config saved to /var/cache/conftool/dbconfig/20260526-114243-fceratto.json
* 11:42 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1214.eqiad.wmnet with OS trixie
* 11:35 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] (duration: 06m 46s)
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92981 and previous config saved to /var/cache/conftool/dbconfig/20260526-113542-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92980 and previous config saved to /var/cache/conftool/dbconfig/20260526-113521-fceratto.json
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1222: Migration of db1222.eqiad.wmnet completed
* 11:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]]
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92978 and previous config saved to /var/cache/conftool/dbconfig/20260526-112513-fceratto.json
* 11:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92977 and previous config saved to /var/cache/conftool/dbconfig/20260526-112326-marostegui.json
* 11:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2181: Migration of db2181.codfw.wmnet completed
* 11:22 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92975 and previous config saved to /var/cache/conftool/dbconfig/20260526-112215-marostegui.json
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Switchover es2042 es2041 for [[phab:T426199|T426199]]', diff saved to https://phabricator.wikimedia.org/P92974 and previous config saved to /var/cache/conftool/dbconfig/20260526-112028-fceratto.json
* 11:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92972 and previous config saved to /var/cache/conftool/dbconfig/20260526-111506-fceratto.json
* 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2181.codfw.wmnet with OS trixie
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92971 and previous config saved to /var/cache/conftool/dbconfig/20260526-110458-fceratto.json
* 11:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1214.eqiad.wmnet with OS trixie
* 11:00 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] (duration: 15m 50s)
* 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92968 and previous config saved to /var/cache/conftool/dbconfig/20260526-105755-fceratto.json
* 10:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92967 and previous config saved to /var/cache/conftool/dbconfig/20260526-105726-fceratto.json
* 10:56 jiji@deploy1003: jiji: Continuing with deployment
* 10:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:51 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92966 and previous config saved to /var/cache/conftool/dbconfig/20260526-104718-fceratto.json
* 10:46 jiji@deploy1003: jiji: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:44 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]]
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92964 and previous config saved to /var/cache/conftool/dbconfig/20260526-103711-fceratto.json
* 10:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2181.codfw.wmnet with OS trixie
* 10:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:32 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92963 and previous config saved to /var/cache/conftool/dbconfig/20260526-102703-fceratto.json
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1226: Migration of db1226.eqiad.wmnet completed
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92960 and previous config saved to /var/cache/conftool/dbconfig/20260526-101936-fceratto.json
* 10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92959 and previous config saved to /var/cache/conftool/dbconfig/20260526-101842-fceratto.json
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:15 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] (duration: 06m 42s)
* 10:09 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92957 and previous config saved to /var/cache/conftool/dbconfig/20260526-100834-fceratto.json
* 10:06 kharlan@deploy1003: kharlan: Continuing with deployment
* 10:05 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:03 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]]
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2195: Migration of db2195.codfw.wmnet completed
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2004.codfw.wmnet
* 10:01 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2004.codfw.wmnet
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
* 09:58 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92955 and previous config saved to /var/cache/conftool/dbconfig/20260526-095827-fceratto.json
* 09:58 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-eqiad@eqiad
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:55 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2003.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2003.codfw.wmnet
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1006.eqiad.wmnet
* 09:53 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1006.eqiad.wmnet
* 09:52 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-eqiad@eqiad
* 09:52 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] (duration: 08m 07s)
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2044.*
* 09:48 fabfur: repooling cp2043 and cp2044 (haproxy-awslc) ([[phab:T419825|T419825]])
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92953 and previous config saved to /var/cache/conftool/dbconfig/20260526-094819-fceratto.json
* 09:47 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:46 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1006.eqiad.wmnet
* 09:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:44 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]]
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1006.eqiad.wmnet
* 09:41 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1005.eqiad.wmnet
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1005.eqiad.wmnet
* 09:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92951 and previous config saved to /var/cache/conftool/dbconfig/20260526-094115-fceratto.json
* 09:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 09:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92950 and previous config saved to /var/cache/conftool/dbconfig/20260526-094045-fceratto.json
* 09:40 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1226: Migration of db1226.eqiad.wmnet completed
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:38 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:34 fabfur: depooling cp2044 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1005.eqiad.wmnet
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2003.codfw.wmnet
* 09:34 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2044.*
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1005.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1004.eqiad.wmnet
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1004.eqiad.wmnet
* 09:33 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 09:32 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] (duration: 06m 52s)
* 09:32 fabfur: depooling cp2043 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1226.eqiad.wmnet with OS trixie
* 09:30 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 09:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92947 and previous config saved to /var/cache/conftool/dbconfig/20260526-093031-fceratto.json
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2003.codfw.wmnet
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2002.codfw.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2002.codfw.wmnet
* 09:28 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:28 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1003.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1003.eqiad.wmnet
* 09:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]]
* 09:25 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:25 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2001.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2001.codfw.wmnet
* 09:21 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:20 fabfur: start rebooting esams liberica instances ([[phab:T426563|T426563]])
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92946 and previous config saved to /var/cache/conftool/dbconfig/20260526-092024-fceratto.json
* 09:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1003.eqiad.wmnet
* 09:16 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2195: Migration of db2195.codfw.wmnet completed
* 09:15 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1003.eqiad.wmnet
* 09:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 09:14 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] (duration: 06m 47s)
* 09:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92944 and previous config saved to /var/cache/conftool/dbconfig/20260526-091016-fceratto.json
* 09:09 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 09:09 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2195.codfw.wmnet with OS trixie
* 09:07 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]]
* 09:06 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92943 and previous config saved to /var/cache/conftool/dbconfig/20260526-090315-fceratto.json
* 09:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2224.codfw.wmnet with reason: Maintenance
* 09:03 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92942 and previous config saved to /var/cache/conftool/dbconfig/20260526-090256-fceratto.json
* 08:57 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1226.eqiad.wmnet with OS trixie
* 08:53 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 fabfur: start rebooting ulsfo liberica instances ([[phab:T426563|T426563]])
* 08:53 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] (duration: 07m 23s)
* 08:53 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1226: Upgrading db1226.eqiad.wmnet
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92941 and previous config saved to /var/cache/conftool/dbconfig/20260526-085248-fceratto.json
* 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1226: Upgrading db1226.eqiad.wmnet
* 08:50 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Migration of db1222.eqiad.wmnet completed
* 08:48 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 08:47 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:46 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]]
* 08:43 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 08:43 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] (duration: 09m 56s)
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92939 and previous config saved to /var/cache/conftool/dbconfig/20260526-084240-fceratto.json
* 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1222.eqiad.wmnet with OS trixie
* 08:40 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:40 fabfur: start rebooting eqsin liberica instances ([[phab:T426563|T426563]])
* 08:39 kartik@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 08:39 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:39 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1024.eqiad.wmnet
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:35 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:33 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]]
* 08:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92938 and previous config saved to /var/cache/conftool/dbconfig/20260526-083233-fceratto.json
* 08:30 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92937 and previous config saved to /var/cache/conftool/dbconfig/20260526-082531-fceratto.json
* 08:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
* 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92936 and previous config saved to /var/cache/conftool/dbconfig/20260526-082458-fceratto.json
* 08:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2195.codfw.wmnet with OS trixie
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2195: Upgrading db2195.codfw.wmnet
* 08:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2195: Upgrading db2195.codfw.wmnet
* 08:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:18 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92934 and previous config saved to /var/cache/conftool/dbconfig/20260526-081451-fceratto.json
* 08:13 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:12 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:10 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92932 and previous config saved to /var/cache/conftool/dbconfig/20260526-080443-fceratto.json
* 08:01 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1222.eqiad.wmnet with OS trixie
* 08:00 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1024.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1023.eqiad.wmnet
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:59 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:58 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:56 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 07:56 fabfur: start rebooting drmrs liberica instances ([[phab:T426563|T426563]])
* 07:56 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92931 and previous config saved to /var/cache/conftool/dbconfig/20260526-075435-fceratto.json
* 07:52 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1047.eqiad.wmnet
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:49 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1023.eqiad.wmnet
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92930 and previous config saved to /var/cache/conftool/dbconfig/20260526-074739-fceratto.json
* 07:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92929 and previous config saved to /var/cache/conftool/dbconfig/20260526-074710-fceratto.json
* 07:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:45 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:43 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1046.eqiad.wmnet
* 07:38 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] (duration: 12m 01s)
* 07:38 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P92928 and previous config saved to /var/cache/conftool/dbconfig/20260526-073702-fceratto.json
* 07:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:35 fabfur: start rebooting magru liberica instances ([[phab:T426563|T426563]])
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92926 and previous config saved to /var/cache/conftool/dbconfig/20260526-073459-fceratto.json
* 07:32 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
* 07:31 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1046.eqiad.wmnet
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260526-072643-fceratto.json
* 07:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
* 07:26 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]]
* 07:25 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92924 and previous config saved to /var/cache/conftool/dbconfig/20260526-072452-fceratto.json
* 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1047.eqiad.wmnet
* 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1047.eqiad.wmnet
* 07:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92923 and previous config saved to /var/cache/conftool/dbconfig/20260526-071635-fceratto.json
* 07:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
* 07:15 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1026.eqiad.wmnet
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92922 and previous config saved to /var/cache/conftool/dbconfig/20260526-071444-fceratto.json
* 07:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 07:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92921 and previous config saved to /var/cache/conftool/dbconfig/20260526-070946-fceratto.json
* 07:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92920 and previous config saved to /var/cache/conftool/dbconfig/20260526-070916-fceratto.json
* 07:09 moritzm: failover Ganeti master in eqiad to ganeti1048
* 07:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1047.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1046.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92919 and previous config saved to /var/cache/conftool/dbconfig/20260526-070436-fceratto.json
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
* 07:04 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1048.eqiad.wmnet
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92918 and previous config saved to /var/cache/conftool/dbconfig/20260526-065909-fceratto.json
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast2003.wikimedia.org
* 06:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 06:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1048.eqiad.wmnet
* 06:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
* 06:53 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1046.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1045.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 06:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92917 and previous config saved to /var/cache/conftool/dbconfig/20260526-064901-fceratto.json
* 06:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92916 and previous config saved to /var/cache/conftool/dbconfig/20260526-064833-fceratto.json
* 06:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
* 06:47 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1222: Switchover
* 06:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6003.wikimedia.org
* 06:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92914 and previous config saved to /var/cache/conftool/dbconfig/20260526-063853-fceratto.json
* 06:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6003.wikimedia.org
* 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92912 and previous config saved to /var/cache/conftool/dbconfig/20260526-063155-fceratto.json
* 06:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:28 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Switchover
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1222 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92910 and previous config saved to /var/cache/conftool/dbconfig/20260526-061656-fceratto.json
* 06:15 fceratto@dns1005: END - running authdns-update
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1162 to s2 primary and set section read-write [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92909 and previous config saved to /var/cache/conftool/dbconfig/20260526-061114-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s2 eqiad as read-only for maintenance - [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92908 and previous config saved to /var/cache/conftool/dbconfig/20260526-061021-fceratto.json
* 06:10 federico3: Starting s2 eqiad failover from db1222 to db1162 - [[phab:T425622|T425622]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1162 with weight 0 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92907 and previous config saved to /var/cache/conftool/dbconfig/20260526-060443-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T425622|T425622]]
* 06:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:02 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:00 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2024.codfw.wmnet,pc[1014,1024].eqiad.wmnet with reason: Maintenance on pc4
* 04:37 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 04:34 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.1 (duration: 02m 32s)
* 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]] (duration: 36m 24s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 20s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-25 ==
* 21:00 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:49 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 20:38 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1045.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1044.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:15 moritzm: truncate krb5kdc.log1 (which made log rotation fail)
* 20:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 19:57 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1044.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1043.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:22 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_eqiad
* 18:49 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1115.eqiad.wmnet
* 18:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5023.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:22 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:22 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5030.eqsin.wmnet
* 18:15 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5023.eqsin.wmnet
* 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1113.eqiad.wmnet
* 18:03 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:02 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqiad
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-upload_eqsin
* 18:01 sukhe: sre.cdn.roll-reboot cookbooks stalled due to icinga reboot
* 18:00 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqsin
* 17:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1043.eqiad.wmnet
* 17:31 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1110.eqiad.wmnet [reason: manually pooling after reboot as icinga was down]
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1042.eqiad.wmnet
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:29 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1111.eqiad.wmnet
* 17:28 sukhe: sukhe@alert1002:~$ sudo systemctl restart icinga.service
* 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92903 and previous config saved to /var/cache/conftool/dbconfig/20260525-171310-fceratto.json
* 17:11 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92902 and previous config saved to /var/cache/conftool/dbconfig/20260525-170302-fceratto.json
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92901 and previous config saved to /var/cache/conftool/dbconfig/20260525-165255-fceratto.json
* 16:51 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1042.eqiad.wmnet
* 16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92900 and previous config saved to /var/cache/conftool/dbconfig/20260525-164247-fceratto.json
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1041.eqiad.wmnet
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:41 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:40 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5021.eqsin.wmnet
* 16:39 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5029.eqsin.wmnet
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92899 and previous config saved to /var/cache/conftool/dbconfig/20260525-163559-fceratto.json
* 16:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92898 and previous config saved to /var/cache/conftool/dbconfig/20260525-163512-fceratto.json
* 16:34 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1108.eqiad.wmnet
* 16:30 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1109.eqiad.wmnet
* 16:26 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92897 and previous config saved to /var/cache/conftool/dbconfig/20260525-162505-fceratto.json
* 16:20 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1041.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1040.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:16 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92896 and previous config saved to /var/cache/conftool/dbconfig/20260525-161457-fceratto.json
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92895 and previous config saved to /var/cache/conftool/dbconfig/20260525-160450-fceratto.json
* 16:02 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92894 and previous config saved to /var/cache/conftool/dbconfig/20260525-155930-fceratto.json
* 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2249.codfw.wmnet with reason: Maintenance
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5020.eqsin.wmnet
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5028.eqsin.wmnet
* 15:52 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1106.eqiad.wmnet
* 15:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1107.eqiad.wmnet
* 15:29 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1040.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1039.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:17 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1013 from dbctl [[phab:T427190|T427190]]', diff saved to https://phabricator.wikimedia.org/P92893 and previous config saved to /var/cache/conftool/dbconfig/20260525-151718-marostegui.json
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5019.eqsin.wmnet
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5027.eqsin.wmnet
* 15:12 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1104.eqiad.wmnet
* 15:11 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1105.eqiad.wmnet
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92892 and previous config saved to /var/cache/conftool/dbconfig/20260525-150309-fceratto.json
* 14:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92891 and previous config saved to /var/cache/conftool/dbconfig/20260525-145301-fceratto.json
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92890 and previous config saved to /var/cache/conftool/dbconfig/20260525-144253-fceratto.json
* 14:33 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1102.eqiad.wmnet
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92889 and previous config saved to /var/cache/conftool/dbconfig/20260525-143246-fceratto.json
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5026.eqsin.wmnet
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5018.eqsin.wmnet
* 14:31 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1103.eqiad.wmnet
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92888 and previous config saved to /var/cache/conftool/dbconfig/20260525-142551-fceratto.json
* 14:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2228.codfw.wmnet with reason: Maintenance
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92887 and previous config saved to /var/cache/conftool/dbconfig/20260525-142520-fceratto.json
* 14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92885 and previous config saved to /var/cache/conftool/dbconfig/20260525-141513-fceratto.json
* 14:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:06 sukhe: curl localhost:9090/pools/inference-staging-grpc_30051 shows ml-staging200[1-3].codfw.wmnet as enabled and pooled: [[phab:T424049|T424049]]
* 14:05 sukhe: sukhe@lvs2013:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92884 and previous config saved to /var/cache/conftool/dbconfig/20260525-140505-fceratto.json
* 14:03 sukhe: sudo cumin 'A:lvs and A:lvs-low-traffic-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service
* 14:00 sukhe: sudo cumin 'A:lvs and A:lvs-secondary-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 13:59 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1039.eqiad.wmnet
* 13:58 sukhe: sudo cumin 'A:lvs and A:eqiad' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"': NOOP change, since service is codfw only
* 13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92882 and previous config saved to /var/cache/conftool/dbconfig/20260525-135458-fceratto.json
* 13:52 Msz2001: Everything deployed, UTC afternoon config+backport window done
* 13:52 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] (duration: 09m 43s)
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1101.eqiad.wmnet
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1100.eqiad.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5025.eqsin.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:49 kart_: Updated Recommendation API to 2026-05-21-044522-production
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92881 and previous config saved to /var/cache/conftool/dbconfig/20260525-134807-fceratto.json
* 13:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2223.codfw.wmnet with reason: Maintenance
* 13:47 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:47 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main'.
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92880 and previous config saved to /var/cache/conftool/dbconfig/20260525-134737-fceratto.json
* 13:45 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: Reboot
* 13:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]]
* 13:40 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqiad
* 13:39 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqiad
* 13:38 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] (duration: 08m 14s)
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqsin
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqsin
* 13:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92878 and previous config saved to /var/cache/conftool/dbconfig/20260525-133729-fceratto.json
* 13:34 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:33 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main'.
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1038.eqiad.wmnet
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:31 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]]
* 13:27 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] (duration: 07m 43s)
* 13:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92876 and previous config saved to /var/cache/conftool/dbconfig/20260525-132722-fceratto.json
* 13:23 mszwarc@deploy1003: mszwarc, jhsoby: Continuing with deployment
* 13:21 mszwarc@deploy1003: mszwarc, jhsoby: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:20 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]]
* 13:19 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] (duration: 15m 53s)
* 13:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92875 and previous config saved to /var/cache/conftool/dbconfig/20260525-131714-fceratto.json
* 13:12 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92873 and previous config saved to /var/cache/conftool/dbconfig/20260525-131023-fceratto.json
* 13:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92872 and previous config saved to /var/cache/conftool/dbconfig/20260525-130950-fceratto.json
* 13:07 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]]
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1162: Reboot
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92870 and previous config saved to /var/cache/conftool/dbconfig/20260525-125942-fceratto.json
* 12:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reboot
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:58 kart_: Updated cxserver to 2026-05-24-103047-production ([[phab:T426808|T426808]], [[phab:T373418|T373418]])
* 12:56 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:56 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:54 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db1162: Reboot
* 12:54 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:54 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:53 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1162.eqiad.wmnet with reason: Reboot
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92868 and previous config saved to /var/cache/conftool/dbconfig/20260525-124934-fceratto.json
* 12:40 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:39 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:39 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1038.eqiad.wmnet
* 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92867 and previous config saved to /var/cache/conftool/dbconfig/20260525-123927-fceratto.json
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92866 and previous config saved to /var/cache/conftool/dbconfig/20260525-123239-fceratto.json
* 12:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92865 and previous config saved to /var/cache/conftool/dbconfig/20260525-123208-fceratto.json
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92864 and previous config saved to /var/cache/conftool/dbconfig/20260525-122201-fceratto.json
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1037.eqiad.wmnet
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92863 and previous config saved to /var/cache/conftool/dbconfig/20260525-121153-fceratto.json
* 12:10 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92862 and previous config saved to /var/cache/conftool/dbconfig/20260525-120145-fceratto.json
* 11:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92861 and previous config saved to /var/cache/conftool/dbconfig/20260525-115504-fceratto.json
* 11:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92860 and previous config saved to /var/cache/conftool/dbconfig/20260525-115434-fceratto.json
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92859 and previous config saved to /var/cache/conftool/dbconfig/20260525-114426-fceratto.json
* 11:43 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1037.eqiad.wmnet
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92858 and previous config saved to /var/cache/conftool/dbconfig/20260525-113419-fceratto.json
* 11:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2160.codfw.wmnet with OS trixie
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92857 and previous config saved to /var/cache/conftool/dbconfig/20260525-112411-fceratto.json
* 11:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92856 and previous config saved to /var/cache/conftool/dbconfig/20260525-111717-fceratto.json
* 11:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92855 and previous config saved to /var/cache/conftool/dbconfig/20260525-111648-fceratto.json
* 11:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92854 and previous config saved to /var/cache/conftool/dbconfig/20260525-110640-fceratto.json
* 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 10:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:56 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92853 and previous config saved to /var/cache/conftool/dbconfig/20260525-105633-fceratto.json
* 10:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92852 and previous config saved to /var/cache/conftool/dbconfig/20260525-104625-fceratto.json
* 10:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2160.codfw.wmnet with OS trixie
* 10:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92851 and previous config saved to /var/cache/conftool/dbconfig/20260525-104141-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to pc3 as master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92850 and previous config saved to /var/cache/conftool/dbconfig/20260525-104055-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to dbctl', diff saved to https://phabricator.wikimedia.org/P92849 and previous config saved to /var/cache/conftool/dbconfig/20260525-104027-marostegui.json
* 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92848 and previous config saved to /var/cache/conftool/dbconfig/20260525-103944-fceratto.json
* 10:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 10:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
* 10:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
* 10:27 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:16 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1006.eqiad.wmnet
* 09:57 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.eqiad.wmnet
* 09:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92847 and previous config saved to /var/cache/conftool/dbconfig/20260525-091302-fceratto.json
* 09:12 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92846 and previous config saved to /var/cache/conftool/dbconfig/20260525-090255-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92845 and previous config saved to /var/cache/conftool/dbconfig/20260525-085247-fceratto.json
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92844 and previous config saved to /var/cache/conftool/dbconfig/20260525-084239-fceratto.json
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92843 and previous config saved to /var/cache/conftool/dbconfig/20260525-083540-fceratto.json
* 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2231.codfw.wmnet with reason: Maintenance
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92842 and previous config saved to /var/cache/conftool/dbconfig/20260525-083511-fceratto.json
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92841 and previous config saved to /var/cache/conftool/dbconfig/20260525-082504-fceratto.json
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92840 and previous config saved to /var/cache/conftool/dbconfig/20260525-081456-fceratto.json
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92839 and previous config saved to /var/cache/conftool/dbconfig/20260525-080448-fceratto.json
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92838 and previous config saved to /var/cache/conftool/dbconfig/20260525-075739-fceratto.json
* 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: Maintenance
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92837 and previous config saved to /var/cache/conftool/dbconfig/20260525-075708-fceratto.json
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92836 and previous config saved to /var/cache/conftool/dbconfig/20260525-074700-fceratto.json
* 07:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92835 and previous config saved to /var/cache/conftool/dbconfig/20260525-073653-fceratto.json
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92834 and previous config saved to /var/cache/conftool/dbconfig/20260525-072645-fceratto.json
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92833 and previous config saved to /var/cache/conftool/dbconfig/20260525-071953-fceratto.json
* 07:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2196.codfw.wmnet with reason: Maintenance
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92832 and previous config saved to /var/cache/conftool/dbconfig/20260525-071924-fceratto.json
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92831 and previous config saved to /var/cache/conftool/dbconfig/20260525-070917-fceratto.json
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2233.codfw.wmnet with OS trixie
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92830 and previous config saved to /var/cache/conftool/dbconfig/20260525-065909-fceratto.json
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92829 and previous config saved to /var/cache/conftool/dbconfig/20260525-064902-fceratto.json
* 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92828 and previous config saved to /var/cache/conftool/dbconfig/20260525-064305-fceratto.json
* 06:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2233.codfw.wmnet with OS trixie
* 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2233.codfw.wmnet with reason: Reimage to Trixie
* 06:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:17 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboot upgrade m2
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2233.codfw.wmnet with reason: Reboot upgrade m2
* 06:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1027.eqiad.wmnet with reason: Reboot
* 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2023.codfw.wmnet,pc[1013,1023].eqiad.wmnet with reason: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1013.eqiad.wmnet: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1013.eqiad.wmnet: Maintenance on pc3
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 43s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-24 ==
* 19:08 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-23 ==
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-22 ==
* 23:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 23:38 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 22:20 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:29 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:28 inflatador: bking@deploy1003 set eqiad prod cirrus `node_concurrent_recoveries` up to 7 from 4 [[phab:T426585|T426585]]
* 20:27 inflatador: bking@deploy1003 set codfw prod cirrus `node_concurrent_recoveries` back down to 4 from 7 [[phab:T426585|T426585]]
* 18:39 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 17:34 topranks: enable ttl protection on esams CRs IBGP session
* 17:28 topranks: enable ttl protection on ulsfo CRs IBGP session
* 16:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:49 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:16 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main'.
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:58 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main'.
* 15:15 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:14 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2008-dev.codfw.wmnet
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb[1020,1022-1025].eqiad.wmnet
* 14:29 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:23 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2008-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2007-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:03 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 13:59 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb[1020,1022-1025].eqiad.wmnet
* 13:58 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 13:53 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2007-dev.codfw.wmnet
* 13:52 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1018.eqiad.wmnet
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:46 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for 6 hosts
* 13:16 inflatador: bking@deploy1002 set search_codfw cluster recovery settings from 4 to 7 [[phab:T426560|T426560]]
* 13:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for 6 hosts
* 13:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 13:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 13:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:10 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet
* 13:09 elukey: uploaded spicerack_12.6.0 to apt.wikimedia.org bookworm-wikimedia
* 13:08 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1017.eqiad.wmnet
* 12:59 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3081.esams.wmnet
* 12:54 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:41 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:15 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3080.esams.wmnet
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 12:03 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3073.esams.wmnet
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: Migration of db2154.codfw.wmnet completed
* 11:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3072.esams.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 11:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1172: Migration of db1172.eqiad.wmnet completed
* 11:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
* 11:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3079.esams.wmnet
* 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
* 10:55 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:48 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet
* 10:43 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2154: Migration of db2154.codfw.wmnet completed
* 10:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
* 10:37 moritzm: remove ganeti1024 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2154.codfw.wmnet with OS trixie
* 10:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2010.codfw.wmnet with OS trixie
* 10:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1172: Migration of db1172.eqiad.wmnet completed
* 10:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3078.esams.wmnet
* 10:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS trixie
* 10:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1017.eqiad.wmnet
* 10:13 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3071.esams.wmnet
* 09:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:56 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS trixie
* 09:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:51 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: Upgrading db2154.codfw.wmnet
* 09:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2154: Upgrading db2154.codfw.wmnet
* 09:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS trixie
* 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS trixie
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3070.esams.wmnet
* 09:21 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:16 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 09:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3077.esams.wmnet
* 09:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:03 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 08:47 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 08:40 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:30 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3076.esams.wmnet
* 08:18 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:15 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:07 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:07 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3069.esams.wmnet
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 07:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 07:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3068.esams.wmnet
* 07:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:10 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3075.esams.wmnet
* 07:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1057
* 07:01 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1057
* 06:58 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3067.esams.wmnet
* 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
* 06:46 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 06:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3007.wikimedia.org
* 06:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3007.wikimedia.org
* 05:25 marostegui@dns1004: END - running authdns-update
* 05:24 marostegui@dns1004: START - running authdns-update
* 05:23 marostegui: Failover m5-master [[phab:T426633|T426633]]
* 05:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1028.eqiad.wmnet with reason: Reboot
* 05:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy2005.codfw.wmnet with reason: Reboot
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1012.eqiad.wmnet
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:06 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:03 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 04:56 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1012.eqiad.wmnet
== 2026-05-21 ==
* 23:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] (duration: 06m 42s)
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified
* 23:36 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]]
* 22:26 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2002.codfw.wmnet with OS trixie
* 22:08 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:03 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:02 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:44 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2002.codfw.wmnet with OS trixie
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:20 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:19 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:26 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:16 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 19:22 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase
* 19:10 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:59 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:53 papaul: rebooting msw1-codfw
* 18:50 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:39 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:50 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:48 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:43 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:39 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 17:39 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:38 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 17:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 17:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:36 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:25 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:25 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:23 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:22 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1016.eqiad.wmnet
* 17:22 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:13 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1016.eqiad.wmnet
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92810 and previous config saved to /var/cache/conftool/dbconfig/20260521-170823-ladsgroup.json
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 16:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 16:55 papaul: rebooting msw-d3-codfw
* 16:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:52 papaul: rebooting msw-c7-codfw
* 16:51 papaul: rebooting msw-c6-codfw
* 16:48 papaul: rebooting msw-b7-codfw
* 16:48 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet
* 16:45 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1014.eqiad.wmnet
* 16:43 papaul: rebooting msw-b6-codfw
* 16:40 papaul: rebooting msw-a1-codfw
* 16:37 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:37 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1014.eqiad.wmnet
* 16:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:33 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:26 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc1022.eqiad.wmnet with reason: Move to nftables
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc2022.codfw.wmnet with reason: Move to nftables
* 16:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Repooling
* 16:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92807 and previous config saved to /var/cache/conftool/dbconfig/20260521-161808-ladsgroup.json
* 16:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:52 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Repooling
* 15:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92804 and previous config saved to /var/cache/conftool/dbconfig/20260521-154108-fceratto.json
* 15:39 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92803 and previous config saved to /var/cache/conftool/dbconfig/20260521-153400-fceratto.json
* 15:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2048.codfw.wmnet with reason: Maintenance
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92802 and previous config saved to /var/cache/conftool/dbconfig/20260521-153331-fceratto.json
* 15:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:25 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92801 and previous config saved to /var/cache/conftool/dbconfig/20260521-152323-fceratto.json
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet
* 15:19 claime: Enabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 15:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet
* 15:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92800 and previous config saved to /var/cache/conftool/dbconfig/20260521-151316-fceratto.json
* 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 15:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet
* 15:07 elukey@cumin1003: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:05 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:04 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] (duration: 10m 11s)
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92799 and previous config saved to /var/cache/conftool/dbconfig/20260521-150308-fceratto.json
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet
* 15:00 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master
* 15:00 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:00 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.pki.restart-reboot (exit_code=0) rolling reboot on A:pki
* 14:57 claime: Disabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 14:56 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:55 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 14:54 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-build1001.eqiad.wmnet
* 14:54 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]]
* 14:54 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 14:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1001.eqiad.wmnet
* 14:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1001.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92798 and previous config saved to /var/cache/conftool/dbconfig/20260521-145132-fceratto.json
* 14:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2040.codfw.wmnet with reason: Maintenance
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92797 and previous config saved to /var/cache/conftool/dbconfig/20260521-145103-fceratto.json
* 14:50 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-build1001.eqiad.wmnet
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Migration of db2241.codfw.wmnet completed
* 14:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1001.eqiad.wmnet
* 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
* 14:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1001.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-eqiad
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1011.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1011.eqiad.wmnet
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92795 and previous config saved to /var/cache/conftool/dbconfig/20260521-144055-fceratto.json
* 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:37 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1011.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1011.eqiad.wmnet
* 14:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 14:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1010.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1010.eqiad.wmnet
* 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92793 and previous config saved to /var/cache/conftool/dbconfig/20260521-143045-fceratto.json
* 14:30 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:30 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:29 elukey@cumin1003: START - Cookbook sre.pki.restart-reboot rolling reboot on A:pki
* 14:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
* 14:27 slyngshede@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-reboot (exit_code=1) rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 14:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet
* 14:24 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1010.eqiad.wmnet
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92792 and previous config saved to /var/cache/conftool/dbconfig/20260521-142037-fceratto.json
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:17 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1010.eqiad.wmnet
* 14:14 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1009.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1009.eqiad.wmnet
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: repool after maintenance
* 14:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92789 and previous config saved to /var/cache/conftool/dbconfig/20260521-140906-fceratto.json
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
* 14:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92788 and previous config saved to /var/cache/conftool/dbconfig/20260521-140837-fceratto.json
* 14:08 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1009.eqiad.wmnet
* 14:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet
* 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
* 14:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Migration of db2241.codfw.wmnet completed
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1009.eqiad.wmnet
* 14:03 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1008.eqiad.wmnet
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1008.eqiad.wmnet
* 14:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2241.codfw.wmnet with OS trixie
* 13:59 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
* 13:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92786 and previous config saved to /var/cache/conftool/dbconfig/20260521-135830-fceratto.json
* 13:58 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1007.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1007.eqiad.wmnet
* 13:51 Lucas_WMDE: UTC afternoon backport+config window done
* 13:51 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] (duration: 07m 20s)
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92784 and previous config saved to /var/cache/conftool/dbconfig/20260521-134822-fceratto.json
* 13:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1007.eqiad.wmnet
* 13:47 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 13:46 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
* 13:45 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes
* 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:44 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 13:43 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]]
* 13:43 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 13:43 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1007.eqiad.wmnet
* 13:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1006.eqiad.wmnet
* 13:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1006.eqiad.wmnet
* 13:41 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] (duration: 06m 52s)
* 13:41 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 13:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet
* 13:38 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
* 13:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92782 and previous config saved to /var/cache/conftool/dbconfig/20260521-133815-fceratto.json
* 13:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1006.eqiad.wmnet
* 13:37 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
* 13:37 dbrant@deploy1003: dbrant: Continuing with deployment
* 13:36 dbrant@deploy1003: dbrant: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
* 13:35 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]]
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1006.eqiad.wmnet
* 13:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1005.eqiad.wmnet
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1005.eqiad.wmnet
* 13:31 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] (duration: 09m 11s)
* 13:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92781 and previous config saved to /var/cache/conftool/dbconfig/20260521-133116-fceratto.json
* 13:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92780 and previous config saved to /var/cache/conftool/dbconfig/20260521-133048-fceratto.json
* 13:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
* 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet
* 13:27 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1005.eqiad.wmnet
* 13:27 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: repool after maintenance
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2241.codfw.wmnet with OS trixie
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:24 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:22 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]]
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1005.eqiad.wmnet
* 13:22 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1004.eqiad.wmnet
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1004.eqiad.wmnet
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92778 and previous config saved to /var/cache/conftool/dbconfig/20260521-132041-fceratto.json
* 13:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
* 13:20 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] (duration: 11m 55s)
* 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 13:17 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet
* 13:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1039: Repooling
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
* 13:15 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1004.eqiad.wmnet
* 13:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 13:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1004.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92776 and previous config saved to /var/cache/conftool/dbconfig/20260521-131033-fceratto.json
* 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1003.eqiad.wmnet
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1003.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 13:10 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2241 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92775 and previous config saved to /var/cache/conftool/dbconfig/20260521-131025-cwilliams.json
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:10 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 13:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3074.esams.wmnet
* 13:06 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2162 to x3 primary [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92774 and previous config saved to /var/cache/conftool/dbconfig/20260521-130609-cwilliams.json
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:04 cezmunsta: Starting x3 codfw failover from db2241 to db2162 - [[phab:T426936|T426936]]
* 13:04 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1003.eqiad.wmnet
* 13:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 13:00 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92772 and previous config saved to /var/cache/conftool/dbconfig/20260521-130018-fceratto.json
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1003.eqiad.wmnet
* 12:59 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:59 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1002.eqiad.wmnet
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1002.eqiad.wmnet
* 12:58 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:57 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2162 with weight 0 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92771 and previous config saved to /var/cache/conftool/dbconfig/20260521-125645-cwilliams.json
* 12:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T426936|T426936]]
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet
* 12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
* 12:54 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1002.eqiad.wmnet
* 12:54 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6008.drmrs.wmnet
* 12:53 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:52 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1002.eqiad.wmnet
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
* 12:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:48 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3066.esams.wmnet
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92770 and previous config saved to /var/cache/conftool/dbconfig/20260521-124707-fceratto.json
* 12:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
* 12:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1039: Repooling
* 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet
* 12:45 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:43 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] (duration: 07m 54s)
* 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92768 and previous config saved to /var/cache/conftool/dbconfig/20260521-124014-fceratto.json
* 12:39 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
* 12:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 12:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:36 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:36 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]]
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:34 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:34 kart_: Updated cxserver to 2026-05-20-034002-production ([[phab:T388690|T388690]], [[phab:T404295|T404295]], [[phab:T391703|T391703]], [[phab:T426605|T426605]])
* 12:34 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
* 12:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
* 12:30 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:30 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
* 12:29 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92767 and previous config saved to /var/cache/conftool/dbconfig/20260521-122905-fceratto.json
* 12:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
* 12:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92766 and previous config saved to /var/cache/conftool/dbconfig/20260521-122839-fceratto.json
* 12:27 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:27 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:26 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-staging-worker
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2003.codfw.wmnet
* 12:23 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2003.codfw.wmnet
* 12:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
* 12:21 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:21 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:21 moritzm: installing nginx security updates
* 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 12:20 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92765 and previous config saved to /var/cache/conftool/dbconfig/20260521-121832-fceratto.json
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2003.codfw.wmnet
* 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
* 12:15 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
* 12:13 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6007.drmrs.wmnet
* 12:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92764 and previous config saved to /var/cache/conftool/dbconfig/20260521-120824-fceratto.json
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2003.codfw.wmnet
* 12:07 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2002.codfw.wmnet
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2002.codfw.wmnet
* 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
* 12:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6014.drmrs.wmnet
* 12:00 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:00 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2002.codfw.wmnet
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
* 11:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92763 and previous config saved to /var/cache/conftool/dbconfig/20260521-115817-fceratto.json
* 11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
* 11:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
* 11:51 taavi: disabling puppet on C:bird to roll out {{Gerrit|1289919}}
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92762 and previous config saved to /var/cache/conftool/dbconfig/20260521-115112-fceratto.json
* 11:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2047.codfw.wmnet with reason: Maintenance
* 11:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2002.codfw.wmnet
* 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92761 and previous config saved to /var/cache/conftool/dbconfig/20260521-115043-fceratto.json
* 11:50 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2001.codfw.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2001.codfw.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt2002.wikimedia.org
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet
* 11:45 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2001.codfw.wmnet
* 11:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp1001.eqiad.wmnet
* 11:44 kartik@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet
* 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt2002.wikimedia.org
* 11:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet
* 11:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92760 and previous config saved to /var/cache/conftool/dbconfig/20260521-114036-fceratto.json
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp1001.eqiad.wmnet
* 11:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
* 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 11:36 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet
* 11:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2001.codfw.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker
* 11:35 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
* 11:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
* 11:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
* 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92759 and previous config saved to /var/cache/conftool/dbconfig/20260521-113028-fceratto.json
* 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2014.codfw.wmnet
* 11:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
* 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
* 11:26 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet
* 11:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2014.codfw.wmnet
* 11:20 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6013.drmrs.wmnet
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92758 and previous config saved to /var/cache/conftool/dbconfig/20260521-112021-fceratto.json
* 11:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
* 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2013.codfw.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
* 11:09 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92757 and previous config saved to /var/cache/conftool/dbconfig/20260521-110851-fceratto.json
* 11:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2037.codfw.wmnet with reason: Maintenance
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92756 and previous config saved to /var/cache/conftool/dbconfig/20260521-110822-fceratto.json
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
* 11:05 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2013.codfw.wmnet
* 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:04 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6006.drmrs.wmnet
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
* 11:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
* 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92753 and previous config saved to /var/cache/conftool/dbconfig/20260521-105815-fceratto.json
* 10:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet
* 10:55 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:54 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2012.codfw.wmnet
* 10:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92752 and previous config saved to /var/cache/conftool/dbconfig/20260521-104807-fceratto.json
* 10:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2012.codfw.wmnet
* 10:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet
* 10:44 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] (duration: 08m 02s)
* 10:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:40 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet
* 10:40 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:39 jiji@deploy1003: jiji: Continuing with deployment
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92751 and previous config saved to /var/cache/conftool/dbconfig/20260521-103759-fceratto.json
* 10:37 jiji@deploy1003: jiji: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:36 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]]
* 10:35 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet
* 10:34 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:29 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet
* 10:27 dcausse: [[phab:T423993|T423993]]: reindexing all archive indices
* 10:27 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92749 and previous config saved to /var/cache/conftool/dbconfig/20260521-102630-fceratto.json
* 10:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2036.codfw.wmnet with reason: Maintenance
* 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92748 and previous config saved to /var/cache/conftool/dbconfig/20260521-102601-fceratto.json
* 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2011.codfw.wmnet
* 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6005.drmrs.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet
* 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2011.codfw.wmnet
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet
* 10:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92747 and previous config saved to /var/cache/conftool/dbconfig/20260521-101552-fceratto.json
* 10:15 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:14 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet
* 10:12 moritzm: installing postgresql security updates
* 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet
* 10:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet
* 10:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon1003.wikimedia.org
* 10:09 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:08 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet
* 10:08 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet
* 10:07 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet
* 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet
* 10:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92746 and previous config saved to /var/cache/conftool/dbconfig/20260521-100545-fceratto.json
* 10:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet
* 10:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet
* 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet
* 10:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet
* 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet
* 10:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org
* 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org
* 09:59 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1005.eqiad.wmnet
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-codfw
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2005.codfw.wmnet
* 09:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2005.codfw.wmnet
* 09:56 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
* 09:56 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:56 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92745 and previous config saved to /var/cache/conftool/dbconfig/20260521-095536-fceratto.json
* 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1384.eqiad.wmnet
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2004.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2004.codfw.wmnet
* 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
* 09:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
* 09:49 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1384.eqiad.wmnet
* 09:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1383.eqiad.wmnet
* 09:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92744 and previous config saved to /var/cache/conftool/dbconfig/20260521-094829-fceratto.json
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92743 and previous config saved to /var/cache/conftool/dbconfig/20260521-094801-fceratto.json
* 09:47 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet
* 09:47 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 [[phab:T426563|T426563]]
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-eqiad
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:44 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1383.eqiad.wmnet
* 09:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1382.eqiad.wmnet
* 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
* 09:39 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet
* 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1382.eqiad.wmnet
* 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1381.eqiad.wmnet
* 09:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2002.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2002.codfw.wmnet
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92742 and previous config saved to /var/cache/conftool/dbconfig/20260521-093754-fceratto.json
* 09:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet
* 09:36 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 09:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6012.drmrs.wmnet
* 09:34 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet
* 09:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1381.eqiad.wmnet
* 09:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1380.eqiad.wmnet
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2001.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2001.codfw.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet
* 09:29 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=eqiad
* 09:29 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts.*,name=codfw
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet
* 09:28 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92741 and previous config saved to /var/cache/conftool/dbconfig/20260521-092746-fceratto.json
* 09:27 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1380.eqiad.wmnet
* 09:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1379.eqiad.wmnet
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet
* 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
* 09:25 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet
* 09:24 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=codfw
* 09:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:23 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-eqiad
* 09:22 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1379.eqiad.wmnet
* 09:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1378.eqiad.wmnet
* 09:21 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-codfw
* 09:21 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:18 moritzm: remove ganeti1023 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92740 and previous config saved to /var/cache/conftool/dbconfig/20260521-091738-fceratto.json
* 09:16 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1378.eqiad.wmnet
* 09:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1376.eqiad.wmnet
* 09:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Repooling
* 09:07 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1376.eqiad.wmnet
* 09:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1375.eqiad.wmnet
* 09:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92738 and previous config saved to /var/cache/conftool/dbconfig/20260521-090609-fceratto.json
* 09:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1037.eqiad.wmnet with reason: Maintenance
* 09:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1375.eqiad.wmnet
* 09:01 btullis@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:55 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6011.drmrs.wmnet
* 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:44 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6004.drmrs.wmnet
* 08:37 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Repooling
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92733 and previous config saved to /var/cache/conftool/dbconfig/20260521-082951-fceratto.json
* 08:29 hashar@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92731 and previous config saved to /var/cache/conftool/dbconfig/20260521-081642-fceratto.json
* 08:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
* 08:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6003.drmrs.wmnet
* 08:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1256.eqiad.wmnet with OS trixie
* 07:52 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:51 marostegui@dns1004: END - running authdns-update
* 07:50 marostegui@dns1004: START - running authdns-update
* 07:48 marostegui: Failover m3-master [[phab:T426633|T426633]]
* 07:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet
* 07:46 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6010.drmrs.wmnet
* 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:38 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6002.drmrs.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:24 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:24 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1256.eqiad.wmnet with OS trixie
* 07:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy1025.eqiad.wmnet with reason: Rebooting
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:52 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
* 06:40 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 06:39 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
* 06:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
* 06:24 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:23 arnaudb@cumin1003: END (FAIL) - Cookbook sre.gerrit.reboot-gerrit (exit_code=99) Rebooting Gerrit on gerrit2003
* 06:22 arnaudb@cumin1003: START - Cookbook sre.gerrit.reboot-gerrit Rebooting Gerrit on gerrit2003
* 06:15 marostegui@dns1004: END - running authdns-update
* 06:14 marostegui: Failover m2-master [[phab:T426633|T426633]]
* 06:13 marostegui@dns1004: START - running authdns-update
* 05:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1012 from dbctl [[phab:T426930|T426930]]', diff saved to https://phabricator.wikimedia.org/P92728 and previous config saved to /var/cache/conftool/dbconfig/20260521-053858-marostegui.json
* 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92727 and previous config saved to /var/cache/conftool/dbconfig/20260521-053000-marostegui.json
* 05:29 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1022 to pc2 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92726 and previous config saved to /var/cache/conftool/dbconfig/20260521-052905-marostegui.json
* 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc1012.eqiad.wmnet with reason: Cloning
* 02:41 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on planet1003.eqiad.wmnet with reason: debug wip
* 02:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1027.eqiad.wmnet
* 01:22 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1027.eqiad.wmnet
* 00:55 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
31wlir0ybu2qa12qxyonysiagb54kv3
Nova Resource:Tools.scholia/SAL
498
244659
2428867
2231363
2026-06-21T09:18:53Z
Stashbot
7414
dhinus: webservice stop (tried restarting, but it keeps crashing) T429738
2428867
wikitext
text/x-wiki
=== 2026-06-21 ===
* 09:18 dhinus: webservice stop (tried restarting, but it keeps crashing) [[phab:T429738|T429738]]
=== 2024-09-30 ===
* 16:17 dcaro: truncated uwsgi.log to 100M (was ~20G)
=== 2020-02-29 ===
* 16:41 bstorm_: stopping and starting the app on the new cluster
* 16:40 bstorm_: cleaned up orphaned replicaset on the old cluster.
=== 2020-02-28 ===
* 16:21 wm-bot: <root> Migrated to 2020 Kubernetes cluster
=== 2020-02-26 ===
* 19:58 wm-bot: <root> Reverted to legacy Kubernetes cluster
* 16:54 wm-bot: <root> Migrated to 2020 Kubernetes cluster
=== 2016-10-10 ===
* 17:38 bd808: Force deleted (`sudo qdel -f ...`) three webservice jobs that were stuck in "deleting" state
<noinclude>[[Category:SAL]]</noinclude>
ikknkh3zv4l4er47bfngdt4u8pj8cmu
Map of database maintenance
0
449160
2428863
2428849
2026-06-21T00:00:09Z
Dexbot
30554
Bot: Updating the report
2428863
wikitext
text/x-wiki
{{/Header}}
== Today (2026-06-21) ==
== Yesterday (2026-06-20) ==
== Last seven days ==
{| class="wikitable"
|+ eqiad
|-
! Section !! Work
|-
| es6 ||
* [[phab:T429118|Migrate es6 section to Debian Trixie (T429118)]] (marostegui)
* [[phab:T429436|Switchover es6 master (es1038 -> es1037) (T429436)]] (marostegui)
|-
| s1 || [[phab:T419635|Drop il_to column from imagelinks table in wmf production (T419635)]] (fceratto)
|-
|}
{| class="wikitable"
|+ codfw
|-
! Section !! Work
|-
| es5 || [[phab:T428572|Migrate es5 section to Debian Trixie (T428572)]] (marostegui)
|-
| es6 || [[phab:T429303|Switchover es6 master (es2035 -> es2037) (T429303)]] (marostegui)
|-
| es7 || [[phab:T429463|Migrate es7 section to Debian Trixie (T429463)]] (marostegui)
|-
| s1 || [[phab:T429190|Switchover s1 master (db2203 -> db2212) (T429190)]] (cwilliams)
|-
|}
[[Category:MariaDB]]
mxkf3x0o6l1wiqnr32c542wwzh7d9c4
Nova Resource:Tools.cluebotng-monitoring/SAL
498
459243
2428856
2427480
2026-06-20T12:18:13Z
Stashbot
7414
wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/27870870297 (https://github.com/cluebotng/component-configs/commits/331545fc215d0e72512d841be6ac751372a6c229)
2428856
wikitext
text/x-wiki
=== 2026-06-20 ===
* 12:18 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/27870870297 (https://github.com/cluebotng/component-configs/commits/331545fc215d0e72512d841be6ac751372a6c229)
=== 2026-06-16 ===
* 12:51 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/27618465342 (https://github.com/cluebotng/component-configs/commits/9b3fcbbd479041d764ab53f0a74027e2df6df4f6)
* 12:34 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/27617439342 (https://github.com/cluebotng/component-configs/commits/25d55050b7cd85801d40ba66cf804d6d4a3b51bd)
* 12:30 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/27617061512 (https://github.com/cluebotng/component-configs/commits/4505fa21f50c25710eaf644eecdc0233f726c946)
* 12:23 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/27616804749 (https://github.com/cluebotng/component-configs/commits/af2bc530f42e1d932c23f8bea9c8d3686c343d10)
* 12:20 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/27616668922 (https://github.com/cluebotng/component-configs/commits/1bfb0487d9d8b3b799e0d9a30269ca314046feba)
* 12:02 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/27615852687 (https://github.com/cluebotng/component-configs/commits/c9746412542e654e3ae7337a570727eb5d7195d6)
* 11:57 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/27615771202 (https://github.com/cluebotng/component-configs/commits/38c93594afd6d57d84fde7b3d927c1c73e0a18f9)
=== 2026-06-15 ===
* 16:24 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/27560286658 (https://github.com/cluebotng/component-configs/commits/5c1faa1ebe1b269c9d3e13fc4d7c7a9fddf23bf2)
* 15:31 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/27557053203 (https://github.com/cluebotng/component-configs/commits/80c3b871693d13296e1c4640105e4a8cabc9d5a4)
* 15:26 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/27556781475 (https://github.com/cluebotng/component-configs/commits/81d6af2f5fa912449070fe6ac104761023a9960b)
* 14:52 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/27554700598 (https://github.com/cluebotng/component-configs/commits/12298f8c7711b0dbc3ebe3196da055b62b307301)
=== 2026-06-10 ===
* 14:56 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/27285126180 (https://github.com/cluebotng/component-configs/commits/3a4f641c7199ec2c34cd294d0baf97b9be997e7b)
* 12:57 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/27277788903 (https://github.com/cluebotng/component-configs/commits/39ecf0765b86afbcbd1be02c9f9a5519245ab884)
=== 2026-05-31 ===
* 17:53 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26719865829 (https://github.com/cluebotng/component-configs/commits/ea5cbd0bd54e33719b9e05b40b425784b205838f)
* 17:51 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26719857015 (https://github.com/cluebotng/component-configs/commits/017ef6a67c2650b45476bc9d5021e7abec570d4f)
* 17:43 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26719584366 (https://github.com/cluebotng/component-configs/commits/db6b6fb5bb2d48eb49c6ceb67565398f3d58dcac)
* 17:38 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26719577987 (https://github.com/cluebotng/component-configs/commits/9d18fa5f3d4d6f4cdcec776c648bb2d93e2ba652)
=== 2026-05-29 ===
* 02:31 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26614336971 (https://github.com/cluebotng/component-configs/commits/622f302a735e2a285b7ac4f46e726aaf538f8d93)
* 00:54 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26611217389 (https://github.com/cluebotng/component-configs/commits/1c99ebab25e9cbf3bd595f45368b3a5e15489941)
=== 2026-05-28 ===
* 23:58 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26609327945 (https://github.com/cluebotng/component-configs/commits/3772f3cb5ead5bf2f5baf68cb048123a3b98c006)
* 23:15 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26607745901 (https://github.com/cluebotng/component-configs/commits/4cb42c1d89330943cefe6c3f647fe33e1760d090)
* 21:36 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26603560881 (https://github.com/cluebotng/component-configs/commits/c78caa8aac55b55f6da890066aba0130382c840f)
* 19:48 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26598011849 (https://github.com/cluebotng/component-configs/commits/0c29ead3e5562572c27b58851d98362fc1222a01)
* 19:03 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26595835360 (https://github.com/cluebotng/component-configs/commits/b6254ea98d0e678578264c75442834eb5f9b6421)
* 16:33 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26587853908 (https://github.com/cluebotng/component-configs/commits/e6b9e2d0e00d40589e0bebda0bbbffdc178bea3a)
* 11:16 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26571205214 (https://github.com/cluebotng/component-configs/commits/8f023304162d9582d5736568d9d7078c10b9041d)
* 10:07 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26568183905 (https://github.com/cluebotng/component-configs/commits/21b5bddbae9516620318fc278d5c3d1e3f295511)
=== 2026-05-21 ===
* 22:37 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26256956868 (https://github.com/cluebotng/component-configs/commits/95f311a49172258bfd4d1c00d74a918a7aceab68)
* 22:33 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26256861316 (https://github.com/cluebotng/component-configs/commits/e4c68154f62c8846320103abb0e97af63b20f9b1)
* 21:48 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26254813158 (https://github.com/cluebotng/component-configs/commits/ebf46d18f22eee769feeeffd6f4603ec4f320bbb)
* 21:44 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26254787328 (https://github.com/cluebotng/component-configs/commits/80d9748516b3523cecfc76bd85d662fe91dde0c4)
=== 2026-05-19 ===
* 02:37 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26072649347 (https://github.com/cluebotng/component-configs/commits/94b7da488f5c23df52fa2e750438a3a80f589f3c)
* 02:08 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26071765151 (https://github.com/cluebotng/component-configs/commits/5184d69b34b8d2442df1fbbd652535f5890b4804)
* 01:33 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26070598819 (https://github.com/cluebotng/component-configs/commits/c5378a5646858d02f07ababdb2db2955000ab187)
=== 2026-05-18 ===
* 23:34 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26066535621 (https://github.com/cluebotng/component-configs/commits/2b0b0bbd4e0a727a468343823d4343fe7a082ef9)
* 23:18 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26065893095 (https://github.com/cluebotng/component-configs/commits/0bf6e1847388fd90f2c03acdc1b80d388cb437e7)
=== 2026-05-17 ===
* 08:50 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25986256508 (https://github.com/cluebotng/component-configs/commits/fcc6e35d16a8ccd601ed7f37bbafcc9b26b8f80f)
* 01:10 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25977577391 (https://github.com/cluebotng/component-configs/commits/b7ea6acca0022ec902075282f7fd686162948bc8)
* 01:07 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25977567974 (https://github.com/cluebotng/component-configs/commits/7f853bffd99230f195cb0bb08b2a29efcd929b17)
* 00:54 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25977310961 (https://github.com/cluebotng/component-configs/commits/febede844a3f1ef2246a9529152904e5d3f07272)
* 00:27 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25976823874 (https://github.com/cluebotng/component-configs/commits/e37f79c584a138b7783dc854dbf40e6338905e83)
=== 2026-05-14 ===
* 21:52 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25887641793 (https://github.com/cluebotng/component-configs/commits/a533fb138a23d96ee62535463d21714a7e24fe18)
* 21:47 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25887423820 (https://github.com/cluebotng/component-configs/commits/55401ec37eaad5c0bffbfbdac85eda0a67eed5c0)
* 21:10 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25885710134 (https://github.com/cluebotng/component-configs/commits/c2d1834d313a784edca516e9f9d4e29fd90a3e5f)
* 21:01 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25885319112 (https://github.com/cluebotng/component-configs/commits/8369d11ffe6281d9c21f2e63e23bd03ac30ca98a)
* 18:29 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25877799217 (https://github.com/cluebotng/component-configs/commits/df1444522160198ee3ab31965936679052be618b)
=== 2026-05-13 ===
* 01:57 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25773326566 (https://github.com/cluebotng/component-configs/commits/7112f7cc4f34aa90e0c7d7db1ee89c2104da8c46)
* 01:33 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25772564857 (https://github.com/cluebotng/component-configs/commits/a1494790dc79d0332e0f7d6bdc8d2cf36635de8a)
* 01:27 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25772343768 (https://github.com/cluebotng/component-configs/commits/6b6570012dd25ddae243b54aa3376d2fb9485ec1)
* 01:17 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25771924892 (https://github.com/cluebotng/component-configs/commits/a8bb5b7a82d6f9fbfe284e624bd8ec03b24b5e0e)
* 01:14 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25771823348 (https://github.com/cluebotng/component-configs/commits/d3130e631e8e51f86a65880414bcb85c6e35b979)
=== 2026-05-12 ===
* 12:54 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25735536230 (https://github.com/cluebotng/component-configs/commits/245e38970666704881e64dbcdbf137cd7cf4769b)
* 12:19 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25733759682 (https://github.com/cluebotng/component-configs/commits/d71f9d5a70137a2c686d661ffd2e3ea4ce69115b)
* 11:33 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25731464158 (https://github.com/cluebotng/component-configs/commits/ae6d4cad7abfa0dd2bde0d04554e35bad6069ed1)
* 09:55 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25726905831 (https://github.com/cluebotng/component-configs/commits/cf3a002fc1808019d0018f4f38c1076cdbcaa643)
* 09:06 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25724491283 (https://github.com/cluebotng/component-configs/commits/8291e582c787ae658abde8c26e8bc2ae9dd381fb)
* 08:36 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25722995016 (https://github.com/cluebotng/component-configs/commits/b4212e1e0296be2fd84c620a56c3d744d9cbaf30)
=== 2026-05-09 ===
* 17:49 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25607652745 (https://github.com/cluebotng/component-configs/commits/25ea91ed216c1cca5ca3de2da9abd7ac1337eb86)
=== 2026-05-06 ===
* 21:50 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25462948750 (https://github.com/cluebotng/component-configs/commits/78047ed0bf8def9f4ff2b181d1710a274328a5b3)
* 21:38 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25462408819 (https://github.com/cluebotng/component-configs/commits/c905103b2bd6129e76da6cf191567f82913c8349)
* 20:57 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25460509016 (https://github.com/cluebotng/component-configs/commits/b9c2dde9382e1a3a5502f5218ecc7eaef6b4f9ec)
* 20:54 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25460310332 (https://github.com/cluebotng/component-configs/commits/511feb967af13f77e59803c69bc3447124733524)
* 18:14 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25452753895 (https://github.com/cluebotng/component-configs/commits/6680b6abb8b5b775f79893981bd070c9e17f0358)
* 18:03 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25452220343 (https://github.com/cluebotng/component-configs/commits/2ba80dd76efd2798ab7d1fedddf2a7baeab1118b)
* 17:50 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25451592437 (https://github.com/cluebotng/component-configs/commits/88790d34de5a0bca24d8685f3f7df749b052647b)
* 16:48 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25448624822 (https://github.com/cluebotng/component-configs/commits/92c52e117c73389fe04c66d53ad083eef60ef387)
* 16:44 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25448365994 (https://github.com/cluebotng/component-configs/commits/7eaf56e56e3eacfd97455f271b152f369ca2b3f4)
=== 2026-04-25 ===
* 02:06 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24919967572 (https://github.com/cluebotng/component-configs/commits/3940629dd68ec9bd151f28eece25db46955ccec8)
* 01:55 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24919765830 (https://github.com/cluebotng/component-configs/commits/a524bd86e0470289dfab9c755b32ffa9a8f71a42)
* 01:17 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24918981203 (https://github.com/cluebotng/component-configs/commits/4aec1980bbdc5a4fd34c73a044f672bb26a5af80)
=== 2026-04-24 ===
* 22:13 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24914156754 (https://github.com/cluebotng/component-configs/commits/693acb1af44127a03eaa151ac35b7894c69ab163)
* 20:48 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24911063126 (https://github.com/cluebotng/component-configs/commits/308ad964ff269a70a7aabfc1cb36e924360806a0)
=== 2026-04-17 ===
* 02:51 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24545107848 (https://github.com/cluebotng/component-configs/commits/93640977d7dd4117faf49bd179aac37720eee5b2)
* 01:33 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24543036687 (https://github.com/cluebotng/component-configs/commits/b97517784b752f7d0055837db122456017b21866)
=== 2026-04-16 ===
* 23:45 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24539805608 (https://github.com/cluebotng/component-configs/commits/bef895ecceca2ac8c0bafe1e39e4680142771a39)
* 21:58 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24536002079 (https://github.com/cluebotng/component-configs/commits/46f94a272d26bcd13cf55d2a223ba74fde716d4e)
* 21:38 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24535245433 (https://github.com/cluebotng/component-configs/commits/6b66d29ba7f99387d0fed0292e21c235b3f9ab0b)
* 12:42 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/24510746155 (https://github.com/cluebotng/component-configs/commits/e927788672bbd71dc8a52dc4b34db54abe3995e9)
* 11:09 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24506838480 (https://github.com/cluebotng/component-configs/commits/db30680b466b66aed23b2f87989cbe2e9fd90511)
* 09:38 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24502989226 (https://github.com/cluebotng/component-configs/commits/d1caa7e619b2f6df59cbdb32d5bdf150ceacfa19)
=== 2026-04-14 ===
* 21:58 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24424850045 (https://github.com/cluebotng/component-configs/commits/9124b4b266ce71985cca6d82fd4261c1850b6d4d)
* 21:03 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24422702845 (https://github.com/cluebotng/component-configs/commits/6d3be4a356e5c50103fa3381dc7c9ce81bf1788f)
=== 2026-04-10 ===
* 15:26 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/24250442419 (https://github.com/cluebotng/component-configs/commits/bfa8b761a017e9b8bb69ae52c5cb731d17bd324f)
* 15:16 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24249897960 (https://github.com/cluebotng/component-configs/commits/68514222ba9a90ece524baf75b02c9835faf87d3)
* 14:25 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/24247609205 (https://github.com/cluebotng/component-configs/commits/51257ea555ac174e0fe397b1923edbf76fada3dc)
* 14:24 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24247598064 (https://github.com/cluebotng/component-configs/commits/7ec9cb1bdb9d3218d0388086118e3609ccb68956)
=== 2026-04-09 ===
* 18:36 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24206712245 (https://github.com/cluebotng/component-configs/commits/e7d5ec988541b9d441a5c565f624b7e88e11204f)
* 18:17 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/24206093215 (https://github.com/cluebotng/component-configs/commits/a97bfe791582e24f1c696f1bd89b965ea233c253)
=== 2026-03-27 ===
* 17:44 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/23659587521 (https://github.com/cluebotng/component-configs/commits/6d62fa6482ab7ce2ff8a99029633d738f068af76)
* 17:42 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/23659545398 (https://github.com/cluebotng/component-configs/commits/8bb9fa3878bfe2763bc9fc12643727602969ae50)
=== 2026-03-21 ===
* 16:22 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/23383640353 (https://github.com/cluebotng/component-configs/commits/3497a25c3d209bdf8f64f3ec3e77e52f2f8debfa)
* 16:21 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/23383563872 (https://github.com/cluebotng/component-configs/commits/6bbc2343b2b24ab7149ad6471504f80a9d529d11)
* 16:16 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/23383560207 (https://github.com/cluebotng/component-configs/commits/86937ba3c912f82970faef09c876b888834d96b2)
=== 2025-12-25 ===
* 13:43 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/20505969753 (https://github.com/cluebotng/component-configs/commits/5a347e229af9bc830ef9a03e0446b46dd8ccfb15)
=== 2025-12-01 ===
* 08:47 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19816470332 (https://github.com/cluebotng/component-configs/commits/00278e339c41812ca8ecd179e1630abfb031117b)
* 08:43 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/19816470332 (https://github.com/cluebotng/component-configs/commits/00278e339c41812ca8ecd179e1630abfb031117b)
=== 2025-11-12 ===
* 07:29 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19289723947 (https://github.com/cluebotng/component-configs/commits/6ad3fbf7ef1281dfed5868d2596d347e01131d18)
=== 2025-11-11 ===
* 15:41 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19270642940 (https://github.com/cluebotng/component-configs/commits/3fe913812986e82db75d4a6657cba3f697f5649c)
* 15:36 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/19270642940 (https://github.com/cluebotng/component-configs/commits/3fe913812986e82db75d4a6657cba3f697f5649c)
* 15:31 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19270294471 (https://github.com/cluebotng/component-configs/commits/df4e433c6a567df4484b7115ddf2c53fe1f9494f)
* 15:27 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19270285003 (https://github.com/cluebotng/component-configs/commits/e103f6ac56b26a2d6e3c0705c81c75a4419287cc)
* 14:40 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19268985739 (https://github.com/cluebotng/component-configs/commits/f88bf173399c2591eca2357bbc9bff54ee70b731)
=== 2025-11-09 ===
* 19:59 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19213633185 (https://github.com/cluebotng/component-configs/commits/f94671068b275a0195e1001943ca83f40b8d81f2)
* 19:24 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19213259803 (https://github.com/cluebotng/component-configs/commits/65af33a993b42c6821f5053c548a142d5b11a42d)
=== 2025-11-04 ===
* 11:05 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19066512964 (https://github.com/cluebotng/component-configs/commits/38bbff0654c639bfe723e48347ec173abc2a3c96)
=== 2025-11-02 ===
* 18:25 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19016297587 (https://github.com/cluebotng/component-configs/commits/7620c10586a5803b6d797aac0fcee5303142d049)
* 18:13 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19016164140 (https://github.com/cluebotng/component-configs/commits/042c1033922105af1510ad83356d090383abb809)
* 17:54 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19015945300 (https://github.com/cluebotng/component-configs/commits/b09ca1fd5dc19977fb34f5f85107e861e53587e7)
* 17:49 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19015905691 (https://github.com/cluebotng/component-configs/commits/3cb78b899eadbb2f6060ea355e89b4c15be3919e)
* 17:47 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19015890628 (https://github.com/cluebotng/component-configs/commits/9a7a2fa73af31eda4276cabe550fb40bf7c173a5)
* 17:07 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19015435727 (https://github.com/cluebotng/component-configs/commits/e834d58735fb8e7d28a81f91d23508869c73d1c5)
=== 2025-09-24 ===
* 17:57 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/17985198507 (https://github.com/cluebotng/component-configs/commits/cfa2541734b05a9da326bbeab2e82cc21d6e91e4)
* 17:40 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/17984820840 (https://github.com/cluebotng/component-configs/commits/6f47ae931d95d85e2c3c1d6b42f1eabc6d3b1960)
* 17:06 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/17984009139 (https://github.com/cluebotng/component-configs/commits/refs/heads/main)
* 16:55 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/17983743046 (https://github.com/cluebotng/component-configs/commits/refs/heads/main)
=== 2025-09-22 ===
* 18:08 wmbot~damian-scripts@tools-bastion-15: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/17924276736
<noinclude>[[Category:SAL]]</noinclude>
aoqtf7f3wal1og2ap5g2hep4d01dvae
Help talk:Toolforge/My first static tool
13
460361
2428861
2026-06-20T13:46:27Z
Dw31415
50147
/* How to update the webservice */ new section
2428861
wikitext
text/x-wiki
== How to update the webservice ==
How do I update the webservice after the code changes? I can't find that. [[User:Dw31415|Dw31415]] ([[User talk:Dw31415|talk]]) 13:46, 20 June 2026 (UTC)
mpgk10v8ptr72unmcvew6q0sdanfhro